
Chapter 3

An Introduction to Stochastic Differential Equations

3.1 Fundamentals of Probability

In this section we review some basic concepts of probability theory and illustrate them with various examples. The review includes, among other things, the concepts of sample space, $\sigma$-field, events, probability measure, probability space, random variable, expectation, convergence of sequences of random variables, and conditional expectation.

3.1.1 Probability and Random Variables

The mathematical model for a random quantity is a random variable. Prior to giving a precise definition of this, we first recall some basic concepts from general probability theory.
Definition 3.1. A collection of subsets of a set $\Omega$ is called a $\sigma$-field or $\sigma$-algebra, denoted by $\mathcal{F}$, if it satisfies the following three properties:
(i) $\Omega \in \mathcal{F}$;
(ii) if $A \in \mathcal{F}$, then $A^c \in \mathcal{F}$ ($\mathcal{F}$ is closed under complementation);
(iii) if $A_1, A_2, \ldots \in \mathcal{F}$, then $\bigcup_{i=1}^{\infty} A_i \in \mathcal{F}$ ($\mathcal{F}$ is closed under countable unions).
The pair $(\Omega, \mathcal{F})$ is then called a measurable space.
Remark 3.1. Note that property (iii) also tells us that $\mathcal{F}$ is closed under countable intersections. Indeed, if $A_1, A_2, \ldots \in \mathcal{F}$, then $A_1^c, A_2^c, \ldots \in \mathcal{F}$ by property (ii), and therefore $\bigcup_{i=1}^{\infty} A_i^c \in \mathcal{F}$. However, using De Morgan's law, we have
\[ \Bigl( \bigcup_{i=1}^{\infty} A_i^c \Bigr)^{\!c} = \bigcap_{i=1}^{\infty} A_i . \]
P.H. Bezandry and T. Diagana, Almost Periodic Stochastic Processes, 61
DOI 10.1007/978-1-4419-9476-9_3, Springer Science+Business Media, LLC 2011
Thus, again by property (ii), $\bigcap_{i=1}^{\infty} A_i \in \mathcal{F}$.
Here are some elementary $\sigma$-fields on the set $\Omega$.
1. $\mathcal{F}_1 = \bigl\{ \emptyset, \Omega \bigr\}$.
2. $\mathcal{F}_2 = \bigl\{ \emptyset, \Omega, A, A^c \bigr\}$ for some $A \neq \emptyset$ and $A \neq \Omega$.
3. $\mathcal{F}_3 = \mathcal{P}(\Omega) = \bigl\{ \text{all subsets of } \Omega, \text{ including } \Omega \text{ itself} \bigr\}$.
Remark 3.2. In general, if $\Omega$ is uncountable, it is not an easy task to describe $\mathcal{F}_3$ because it is simply too big, as it contains all possible subsets of $\Omega$. However, $\mathcal{F}_3$ can be chosen to contain any set of interest.
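As a small illustration outside the text, the three axioms of Definition 3.1 can be checked mechanically for a finite $\Omega$; for a finite collection, closure under countable unions reduces to closure under pairwise unions. The sketch below (the function name `is_sigma_field` and the concrete sets are our own choices, not from the book) verifies that $\mathcal{F}_2$ and the power set $\mathcal{F}_3$ are $\sigma$-fields.

```python
from itertools import chain, combinations

def is_sigma_field(omega, F):
    """Check the sigma-field axioms of Definition 3.1 for a finite collection F
    of subsets of omega. For finite F, countable unions reduce to pairwise unions."""
    omega = frozenset(omega)
    F = {frozenset(s) for s in F}
    if omega not in F:                         # (i) the whole space belongs to F
        return False
    if any(omega - s not in F for s in F):     # (ii) closed under complementation
        return False
    return all(s | t in F for s in F for t in F)   # (iii) closed under unions

omega = {1, 2, 3, 4}
A = frozenset({1, 2})
F2 = {frozenset(), frozenset(omega), A, frozenset(omega) - A}  # {empty, Omega, A, A^c}
# F3 = power set of omega (all 2**4 = 16 subsets)
F3 = {frozenset(s)
      for s in chain.from_iterable(combinations(sorted(omega), r)
                                   for r in range(len(omega) + 1))}
```

Dropping $A^c$ from $\mathcal{F}_2$ breaks axiom (ii), which the checker detects.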
Now, let $\mathcal{U}$ be a collection of subsets of $\Omega$ and define
\[ \sigma(\mathcal{U}) = \bigcap_{\mathcal{G} \supseteq \mathcal{U}} \mathcal{G}, \]
where the $\mathcal{G}$'s are $\sigma$-fields on $\Omega$ containing $\mathcal{U}$. Then $\sigma(\mathcal{U})$ is a unique $\sigma$-field, called the $\sigma$-field generated by $\mathcal{U}$. There is no $\sigma$-field smaller than $\sigma(\mathcal{U})$ that includes $\mathcal{U}$. The Borel $\sigma$-field is generated by the collection of open sets of a topological space. The elements of this $\sigma$-field are called Borel sets. For instance, the Borel $\sigma$-field on $\mathbb{R}$ is generated by the intervals in $\mathbb{R}$ and is denoted by $\mathcal{B}(\mathbb{R})$.
Definition 3.2. A probability measure on a measurable space $(\Omega, \mathcal{F})$ is a set function $P : \mathcal{F} \to [0, 1]$ with the properties
(i) $P(\Omega) = 1$;
(ii) if $A_1, A_2, \ldots \in \mathcal{F}$ and $(A_i)_{i=1}^{\infty}$ is disjoint (i.e., $A_i \cap A_j = \emptyset$ if $i \neq j$), then
\[ P\Bigl( \bigcup_{i=1}^{\infty} A_i \Bigr) = \sum_{i=1}^{\infty} P(A_i). \]
The triple $(\Omega, \mathcal{F}, P)$ is then called a probability space. The subsets $A$ of $\Omega$ which are elements of $\mathcal{F}$ are called $\mathcal{F}$-measurable sets. In a probability context, these sets are called events, and we interpret $P(A)$ as the probability that the event $A$ occurs. Note that if $P(A) = 1$, we say that $A$ occurs almost surely (a.s.).
Example 3.1. Let $\Omega = [0, 1]$, $\mathcal{F} = \mathcal{B}([0, 1])$, the Borel $\sigma$-field on $[0, 1]$, and $P = \lambda$, the Lebesgue measure on $[0, 1]$. In this case, the open intervals of the form $(a, b)$, where $0 < a < b < 1$, could be taken as the generator sets, and $\lambda\bigl( (a, b) \bigr) = b - a$. Hence, the triple $\bigl( [0, 1], \mathcal{B}([0, 1]), \lambda \bigr)$ is a probability space.

Definition 3.3. A probability space $(\Omega, \mathcal{F}, P)$ is said to be complete if for every $A \subset B$ such that $B \in \mathcal{F}$ and $P(B) = 0$, one has $A \in \mathcal{F}$.

We assume throughout the book that all probability spaces are complete.
3.1.2 Sequence of Events

For a sequence of events $A_i \in \mathcal{F}$, $i = 1, 2, \ldots$ on this space, define the limit superior
\[ \limsup_{i \to \infty} A_i = \bigcap_{j=1}^{\infty} \bigcup_{i=j}^{\infty} A_i = \bigl\{ \omega : \omega \in A_i \text{ for infinitely many } i\text{'s} \bigr\} = \bigl\{ \omega : \omega \in A_i \text{ infinitely often} \bigr\} = \bigl\{ \omega : \omega \in A_i, \text{ i.o.} \bigr\}. \]
Similarly, we define the limit inferior
\[ \liminf_{i \to \infty} A_i = \bigcup_{j=1}^{\infty} \bigcap_{i=j}^{\infty} A_i = \bigl\{ \omega : \omega \in A_i \text{ for all but finitely many } i\text{'s} \bigr\}. \]
It is not difficult to show that $\limsup_{i} A_i$ and $\liminf_{i} A_i$ belong to $\mathcal{F}$ and that
\[ \liminf_{i \to \infty} A_i \subset \limsup_{i \to \infty} A_i . \]
If $\liminf_{i} A_i = \limsup_{i} A_i$, the sequence $(A_i)$ is said to be convergent with limit $A$, where $A = \lim_{i} A_i$.
The following lemma, due to Borel and Cantelli, is extremely useful for the derivations of many limit theorems of probability theory.

Lemma 3.1. (i) If $(A_i)$ is a sequence of arbitrary events and $\sum_{i=1}^{\infty} P(A_i) < \infty$, then
\[ P\bigl( \limsup_{i \to \infty} A_i \bigr) = 0. \]
(ii) If $(A_i)$ is a sequence of independent events satisfying $\sum_{i=1}^{\infty} P(A_i) = \infty$, then
\[ P\bigl( \limsup_{i \to \infty} A_i \bigr) = 1. \]
Proof. (i) Note first that $\limsup_{i \to \infty} A_i \subset \bigcup_{i=j}^{\infty} A_i$ for every $j$. Now, using the monotonicity and subadditivity of $P$, we have
\[ P\bigl( \limsup_{i \to \infty} A_i \bigr) \le P\Bigl( \bigcup_{i=j}^{\infty} A_i \Bigr) \le \sum_{i=j}^{\infty} P(A_i), \quad j = 1, 2, \ldots, \]
and the term on the extreme right in these inequalities tends to zero as $j \to \infty$, since the series $\sum_{i=1}^{\infty} P(A_i)$ is assumed to be convergent. Hence, $P\bigl( \limsup_{i} A_i \bigr) = 0$.
(ii) Note that
\[ 1 - P\bigl( \limsup_{i \to \infty} A_i \bigr) = P\Bigl( \bigl( \limsup_{i \to \infty} A_i \bigr)^{\!c} \Bigr) = P\bigl( \liminf_{i \to \infty} A_i^c \bigr) = \lim_{j \to \infty} P\Bigl( \bigcap_{i=j}^{\infty} A_i^c \Bigr). \]
Since the $A_i$'s are independent, we have that for every $j$,
\begin{align*}
P\Bigl( \bigcap_{i=j}^{\infty} A_i^c \Bigr) &= \lim_{n \to \infty} P\Bigl( \bigcap_{i=j}^{n} A_i^c \Bigr) = \lim_{n \to \infty} \prod_{i=j}^{n} P\bigl( A_i^c \bigr) = \lim_{n \to \infty} \prod_{i=j}^{n} \bigl( 1 - P(A_i) \bigr) \\
&\le \lim_{n \to \infty} \prod_{i=j}^{n} \exp\bigl( -P(A_i) \bigr) = \lim_{n \to \infty} \exp\Bigl( -\sum_{i=j}^{n} P(A_i) \Bigr).
\end{align*}
Now, using the fact that the series $\sum_{i=1}^{\infty} P(A_i)$ is divergent, we conclude that the right-hand side of the last inequality equals zero. This completes the proof. $\square$
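The dichotomy in Lemma 3.1 is easy to see numerically. The following Monte Carlo sketch (illustrative only, not part of the text; the helper `tail_counts` and its parameters are our own) draws independent events $A_i$ with $P(A_i) = p(i)$ and counts how many occur beyond a large index: when $\sum p(i) < \infty$ (e.g. $p(i) = 1/i^2$) the tail is almost always empty, while when $\sum p(i) = \infty$ (e.g. $p(i) = 1/i$) events keep occurring.

```python
import random

random.seed(0)

def tail_counts(p, n_events=2000, tail_start=100, trials=500):
    """For each trial, count how many independent events A_i with i > tail_start
    occur, where P(A_i) = p(i). Borel-Cantelli: summable p(i) => almost no tail
    occurrences; independent events with divergent sum => infinitely many a.s."""
    counts = []
    for _ in range(trials):
        c = sum(1 for i in range(tail_start + 1, n_events + 1)
                if random.random() < p(i))
        counts.append(c)
    return counts

summable = tail_counts(lambda i: 1.0 / i ** 2)   # sum 1/i^2 < infinity
divergent = tail_counts(lambda i: 1.0 / i)       # sum 1/i = infinity
frac_any_summable = sum(c > 0 for c in summable) / len(summable)
mean_divergent = sum(divergent) / len(divergent)
```

In the summable case the chance of any tail event is bounded by $\sum_{i>100} 1/i^2 \approx 0.01$, while in the divergent case the expected tail count is about $\ln 20 \approx 3$.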
Definition 3.4. A function $Y : \Omega \to \mathbb{R}$ is called $\mathcal{F}$-measurable if
\[ Y^{-1}(U) := \bigl\{ \omega \in \Omega : Y(\omega) \in U \bigr\} \in \mathcal{F} \]
for all $U \in \mathcal{B}(\mathbb{R})$.
We are now prepared to give a precise definition of a random variable.

Definition 3.5. An $\mathbb{R}$-valued random variable $X$ is an $\mathcal{F}$-measurable function $X : \Omega \to \mathbb{R}$. Every random variable $X$ induces a probability measure $\mu_X$ on $\mathbb{R}$, defined by
\[ \mu_X(B) = P\bigl( X^{-1}(B) \bigr), \quad B \in \mathcal{B}(\mathbb{R}). \]
$\mu_X$ is called the distribution of $X$.
Definition 3.6. Suppose that $X$ is a random variable with
\[ \int_{\Omega} \bigl| X(\omega) \bigr| \, dP(\omega) < \infty. \]
Then the expectation of $X$ is the number
\[ E\bigl[ X \bigr] := \int_{\Omega} X(\omega) \, dP(\omega) = \int_{\mathbb{R}} x \, d\mu_X(x). \]
Here are some standard inequalities which will frequently be used throughout this book.

Proposition 3.1. (i) The Markov inequality: If $h : \mathbb{R} \to (0, \infty)$ is a strictly positive, even function that increases on $(0, \infty)$ and $E[h(X)] < \infty$, then
\[ P\bigl( |X| > a \bigr) \le \frac{E\bigl[ h(X) \bigr]}{h(a)}, \quad a > 0. \]
(ii) The Chebyshev inequality:
\[ P\bigl( |X - E X| > a \bigr) \le \frac{\operatorname{Var}(X)}{a^2}, \quad a > 0. \]
(iii) The Cauchy–Schwarz inequality:
\[ E|XY| \le \bigl( E[X^2] \bigr)^{1/2} \bigl( E[Y^2] \bigr)^{1/2}. \]
(iv) The Hölder inequality: If $1 < p < \infty$ and $q$ is given by $1/p + 1/q = 1$, $E|X|^p < \infty$, and $E|Y|^q < \infty$, then
\[ E|XY| \le \bigl( E|X|^p \bigr)^{1/p} \bigl( E|Y|^q \bigr)^{1/q}. \]
(v) The Jensen inequality: Let $f$ be a convex function on $\mathbb{R}$. If $E|X|$ and $E|f(X)|$ are finite, then
\[ f\bigl( E[X] \bigr) \le E\bigl[ f(X) \bigr]. \]
Proof. (i) For any $a > 0$, we have
\[ E\bigl[ h(X) \bigr] = \int_{\Omega} h(X) \, dP \ge \int_{\{|X| \ge a\}} h(X) \, dP \ge h(a) \, P\bigl( |X| \ge a \bigr), \]
which implies the desired result.
(ii) This property can be obtained from part (i) by replacing $X$ by $X - E[X]$ and taking $h(x) = x^2$.
(iii) & (iv) Property (iv) is an extension of the Hölder inequality to the context of probability. Its proof is almost identical to that of Proposition 1.2 and is therefore omitted. As to property (iii), it is a particular case of (iv) with $p = q = 2$.
(v) To establish this property, let $l(x)$ be a tangent line to $f(x)$ at the point $x = E[X]$. Write $l(x) = ax + b$ for some $a$ and $b$. Now, by the convexity of $f$ we have $f(x) \ge ax + b$. We then have
\[ E\bigl[ f(X) \bigr] \ge E[aX + b] = a E[X] + b = l\bigl( E[X] \bigr) = f\bigl( E[X] \bigr). \]
The latter identity holds since $l$ is tangent to $f$ at the point $E[X]$. This completes the proof. $\square$
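As a sanity check outside the text, the Chebyshev inequality of Proposition 3.1(ii) can be verified empirically. For $X$ uniform on $[0, 1]$ we have $E X = 1/2$ and $\operatorname{Var} X = 1/12$, so the sampled tail probability $P(|X - EX| > a)$ should sit below $\operatorname{Var}(X)/a^2$ for every $a > 0$. (The sample size and the values of $a$ below are arbitrary choices for illustration.)

```python
import random

random.seed(1)

# Monte Carlo check of Chebyshev: P(|X - EX| > a) <= Var(X) / a^2,
# with X ~ Uniform[0, 1], EX = 1/2, Var X = 1/12.
n = 100_000
xs = [random.random() for _ in range(n)]
mean = sum(xs) / n
var = sum((x - mean) ** 2 for x in xs) / n

results = []
for a in (0.3, 0.4, 0.45):
    tail = sum(abs(x - mean) > a for x in xs) / n   # empirical tail probability
    results.append((a, tail, var / a ** 2))          # (a, P(...), Chebyshev bound)
```

For this distribution the bound is far from tight (e.g. at $a = 0.3$ the true tail is $0.4$ while the bound is about $0.93$), which is typical of Chebyshev-type estimates.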
3.1.3 Convergence of Random Variables

Let $X$ and $X_n$, $n = 1, 2, \ldots$ be real-valued random variables defined on a probability space $(\Omega, \mathcal{F}, P)$. The convergence of the sequence $(X_n)$ toward $X$ has various definitions, depending on the way in which the difference between $X_n$ and $X$ is evaluated. In this subsection, we discuss the following modes of convergence: convergence in distribution, convergence in probability, almost sure convergence, and $L^p$ convergence.
3.1.3.1 Convergence in Distribution

Definition 3.7. The sequence $(X_n)$ converges in distribution to $X$ if for all continuity points $x$ of the distribution $F_X$,
\[ \lim_{n \to \infty} F_{X_n}(x) = F_X(x). \]
Here, $F_{X_n}$ and $F_X$ are the cumulative distribution functions of $X_n$ and $X$, respectively.
Among the important characterizations of convergence in distribution is the following.

Proposition 3.2. The sequence $(X_n)$ converges in distribution to the random variable $X$ if and only if for all bounded, continuous functions $f$,
\[ \lim_{n \to \infty} E\bigl[ f(X_n) \bigr] = E\bigl[ f(X) \bigr]. \]

It is well known that convergence in distribution is equivalent to pointwise convergence of the corresponding characteristic functions: $(X_n)$ converges in distribution to $X$ if and only if $\lim_{n \to \infty} E\bigl[ e^{i t X_n} \bigr] = E\bigl[ e^{i t X} \bigr]$ for every $t \in \mathbb{R}$.

Also, note that although we speak of a sequence of random variables converging in distribution, it is really the distributions of those random variables that converge, not the random variables themselves.
3.1.3.2 Convergence in Probability

Definition 3.8. The sequence $(X_n)$ converges in probability to the random variable $X$ if for any $\varepsilon > 0$,
\[ \lim_{n \to \infty} P\bigl( |X_n - X| > \varepsilon \bigr) = 0. \]
Example 3.2. (Convergence in distribution, not in probability) Consider a sequence $(X_n)_{n \ge 0}$ of independent random variables defined on the probability space $(\Omega, \mathcal{F}, P)$, taking the values one and zero with probabilities $P(X_n = 1) = P(X_n = 0) = \frac{1}{2}$. This sequence converges to $X_0$ in distribution but does not converge in probability to $X_0$. To see this, let us compute the cumulative distribution function of $X_n$. We have
\[ F_{X_n}(t) = \begin{cases} 0 & \text{if } t < 0, \\[2pt] \tfrac{1}{2} & \text{if } 0 \le t < 1, \\[2pt] 1 & \text{if } t \ge 1. \end{cases} \]
Clearly, $F_{X_n}(t) = F_{X_0}(t)$ for all $n$ and $t$. Therefore, $(X_n)$ converges in distribution to $X_0$. However, for $n \neq 0$, note that
\[ X_n - X_0 = \begin{cases} -1 & \text{with probability } \tfrac{1}{4}, \\[2pt] 0 & \text{with probability } \tfrac{1}{2}, \\[2pt] 1 & \text{with probability } \tfrac{1}{4}. \end{cases} \]
Let us now compute $P\bigl( |X_n - X_0| > \frac{1}{2} \bigr)$. We have
\begin{align*}
P\Bigl( |X_n - X_0| > \tfrac{1}{2} \Bigr) &= P\Bigl( \bigl( X_n - X_0 < -\tfrac{1}{2} \bigr) \cup \bigl( X_n - X_0 > \tfrac{1}{2} \bigr) \Bigr) \\
&= P\bigl( X_n - X_0 = -1 \bigr) + P\bigl( X_n - X_0 = 1 \bigr) = \tfrac{1}{4} + \tfrac{1}{4} = \tfrac{1}{2}.
\end{align*}
Now, take $\varepsilon = \frac{1}{2}$. We obtain
\[ \lim_{n \to \infty} P\Bigl( |X_n - X_0| > \tfrac{1}{2} \Bigr) = \tfrac{1}{2} \neq 0. \]
Hence, $(X_n)$ does not converge in probability to $X_0$.
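The computation $P\bigl(|X_n - X_0| > \frac{1}{2}\bigr) = \frac{1}{2}$ above can be confirmed by simulation. This is an illustrative sketch only (sample size and seed are arbitrary): draw independent fair $0/1$ variables for $X_n$ and $X_0$ and estimate the probability that they differ.

```python
import random

random.seed(2)

# X_0 and X_n are independent, each 0 or 1 with probability 1/2; they differ
# (|X_n - X_0| > 1/2) exactly when one is 0 and the other is 1.
trials = 200_000
hits = sum(abs(random.randint(0, 1) - random.randint(0, 1)) > 0.5
           for _ in range(trials))
prob = hits / trials   # estimates P(|X_n - X_0| > 1/2) = 1/2
```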
The above example shows that convergence in distribution does not imply convergence in probability. However, we can establish the following.
Proposition 3.3. If a sequence $(X_n)$ converges in probability to $X$, then it converges in distribution to $X$.

Proof. Let $t$ be a point of continuity of $F = F_X$ and let $\varepsilon > 0$. Then,
\[ P\bigl( X \le t - \varepsilon \bigr) = P\bigl( X \le t - \varepsilon, \ X_n \le t \bigr) + P\bigl( X \le t - \varepsilon, \ X_n > t \bigr) \le P\bigl( X_n \le t \bigr) + P\bigl( |X_n - X| > \varepsilon \bigr). \]
Similarly,
\[ P\bigl( X_n \le t \bigr) = P\bigl( X_n \le t, \ X > t + \varepsilon \bigr) + P\bigl( X_n \le t, \ X \le t + \varepsilon \bigr) \le P\bigl( |X_n - X| > \varepsilon \bigr) + P\bigl( X \le t + \varepsilon \bigr). \]
Since $F$ is continuous at $t$, for $\varepsilon$ small enough we have
\[ P\bigl( X \le t + \varepsilon \bigr) \le F(t) + \varepsilon \quad \text{and} \quad P\bigl( X \le t - \varepsilon \bigr) \ge F(t) - \varepsilon. \]
Combining these inequalities, we obtain
\[ F(t) - \varepsilon - P\bigl( |X_n - X| > \varepsilon \bigr) \le P\bigl( X_n \le t \bigr) \le F(t) + \varepsilon + P\bigl( |X_n - X| > \varepsilon \bigr). \]
Now, letting $n \to \infty$ and using the fact that $X_n \to X$ in probability, we obtain
\[ F(t) - \varepsilon \le \liminf_{n \to \infty} P\bigl( X_n \le t \bigr) \le \limsup_{n \to \infty} P\bigl( X_n \le t \bigr) \le F(t) + \varepsilon, \]
which, since $\varepsilon > 0$ was arbitrary, implies the desired result. $\square$
The following proposition shows that the converse of Proposition 3.3 is true if $X$ is degenerate.

Proposition 3.4. If $(X_n)$ converges to the constant $c$ in distribution, then $(X_n)$ converges to $c$ in probability.

Proof. Suppose that $(X_n)$ converges to $c$ in distribution. Then, we have
\[ F_{X_n}(t) = P(X_n \le t) \to 0 \quad \text{for all } t < c \tag{3.1} \]
and
\[ F_{X_n}(t) = P(X_n \le t) \to 1 \quad \text{for all } t > c. \tag{3.2} \]
It follows from (3.1) and (3.2) that for any $\varepsilon > 0$, one can find $n_0$ and $n_1$ such that
\[ P\bigl( X_n \le c - \varepsilon \bigr) < \frac{\varepsilon}{2} \quad \text{for any } n \ge n_0 \]
and
\[ P\bigl( X_n > c + \varepsilon \bigr) < \frac{\varepsilon}{2} \quad \text{for any } n \ge n_1. \]
Now, let $N = \max(n_0, n_1)$. Then, for any $n \ge N$,
\[ P\bigl( |X_n - c| > \varepsilon \bigr) = P\bigl( X_n < c - \varepsilon \bigr) + P\bigl( X_n > c + \varepsilon \bigr) \le P\bigl( X_n \le c - \varepsilon \bigr) + P\bigl( X_n > c + \varepsilon \bigr) < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]
Thus, $(X_n)$ converges to $c$ in probability. $\square$
3.1.3.3 Almost Sure Convergence

Definition 3.9. The sequence $(X_n)$ converges almost surely (a.s.) to the random variable $X$ if
\[ P\bigl( \omega : \lim_{n \to \infty} X_n(\omega) = X(\omega) \bigr) = 1. \]

Example 3.3. Let the sample space $\Omega$ be the closed unit interval $[0, 1]$ with the uniform probability distribution $\lambda$. Define random variables $X_n(\omega) = \omega + \omega^n$ and $X(\omega) = \omega$. For every $\omega \in [0, 1)$, $\omega^n \to 0$ as $n \to \infty$ and $X_n(\omega) \to \omega$. However, since $X_n(1) = 2$ for every $n$, $X_n(1)$ does not converge to $1 = X(1)$. But, since the convergence occurs on the set $[0, 1)$ and $\lambda(\{1\}) = 0$, $(X_n)$ converges to $X$ almost surely.
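Example 3.3 is easy to reproduce numerically: for any fixed $\omega \in [0,1)$, the term $\omega^n$ decays to zero, while at the single exceptional point $\omega = 1$ the value stays at $2$. The sketch below (our own illustration; $n = 10\,000$ is an arbitrary large index) checks both facts.

```python
def X_n(omega, n):
    """X_n(omega) = omega + omega**n from Example 3.3."""
    return omega + omega ** n

n = 10_000
# On [0, 1): omega**n -> 0, so X_n(omega) -> X(omega) = omega.
gaps = [abs(X_n(w, n) - w) for w in (0.0, 0.3, 0.9, 0.999)]
# At the single point omega = 1 (a Lebesgue-null set), X_n(1) = 2 for every n.
at_one = X_n(1.0, n)
```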
Note that almost sure convergence has some equivalent definitions. For instance, it can easily be shown that $X_n \to X$ a.s. if and only if for any $\varepsilon > 0$ we have
\[ \lim_{m \to \infty} P\bigl( |X_n - X| > \varepsilon \ \text{for some } n \ge m \bigr) = 0. \tag{3.3} \]
The following proposition provides an important sufficient condition for almost sure convergence.

Proposition 3.5. If $\sum_{n=1}^{\infty} P\bigl( |X_n - X| > \varepsilon \bigr) < \infty$ for every $\varepsilon > 0$, then the sequence $(X_n)$ converges to $X$ almost surely.

Proof. Let $E_n(\varepsilon) = \bigl\{ |X_n - X| > \varepsilon \bigr\}$. Then, by assumption, the series $\sum_{n=1}^{\infty} P\bigl( E_n(\varepsilon) \bigr)$ is convergent. By Lemma 3.1(i), $P\bigl( \limsup_{n} E_n(\varepsilon) \bigr) = 0$ for each $\varepsilon > 0$, which implies the desired result. $\square$
Using property (3.3), we can now establish the following.

Proposition 3.6. If a sequence $(X_n)$ converges almost surely to $X$, then it converges in probability to $X$.

Proof. Let $E_i(k) = \bigl\{ |X_i - X| \ge \frac{1}{k} \bigr\}$. Then $A_{nk} = \bigcup_{i=n}^{\infty} E_i(k)$ is the set of those $\omega$ such that $|X_i(\omega) - X(\omega)| \ge \frac{1}{k}$ for some $i \ge n$, and observe that
\[ P\bigl( A_{nk} \bigr) \ge P\bigl( E_n(k) \bigr) \quad \text{for all } n, k \ge 1. \]
Now, since $X_n \to X$ a.s., $P\bigl( A_{nk} \bigr) \to 0$ as $n \to \infty$ for all $k$. The latter property shows that $P\bigl( E_n(k) \bigr) \to 0$ for all $k$, which implies the desired result. $\square$
The converse of Proposition 3.6 is false. However, we can establish the following proposition.

Proposition 3.7. If $(X_n)$ converges in probability to $X$, there exists a suitable subsequence $(n_k)$ such that $(X_{n_k})$ converges almost surely to $X$.

Proof. Pick an increasing sequence $(n_k)$ such that
\[ P\Bigl( |X_{n_k} - X| > \frac{1}{k} \Bigr) \le \frac{1}{k^2}. \]
This can be done since $X_n \to X$ in probability. Then, for any $\varepsilon > 0$, we have
\begin{align*}
\sum_{k=1}^{\infty} P\bigl( |X_{n_k} - X| > \varepsilon \bigr)
&\le \sum_{k :\, k < 1/\varepsilon} P\bigl( |X_{n_k} - X| > \varepsilon \bigr) + \sum_{k :\, k \ge 1/\varepsilon} P\Bigl( |X_{n_k} - X| > \frac{1}{k} \Bigr) \\
&\le \sum_{k :\, k < 1/\varepsilon} P\bigl( |X_{n_k} - X| > \varepsilon \bigr) + \sum_{k=1}^{\infty} \frac{1}{k^2} < \infty.
\end{align*}
Consequently, $X_{n_k} \to X$ a.s. as $k \to \infty$ by Proposition 3.5. $\square$
The following theorem extends some properties of algebraic operations on convergent sequences of real numbers to sequences of random variables.

Theorem 3.1. (Slutsky's Theorem) If $X_n$ converges in distribution to $X$ and $Y_n$ converges to a constant $a$ in probability, then
(a) $X_n + Y_n$ converges to $X + a$ in distribution.
(b) $Y_n X_n$ converges to $aX$ in distribution.

Proof. (a) We may assume that $a = 0$. Let $x$ be a continuity point of the cumulative distribution function $F_X$ of $X$ and let $\varepsilon > 0$. We then have
\[ P\bigl( X_n + Y_n \le x \bigr) = P\bigl( X_n + Y_n \le x, \ |Y_n| \le \varepsilon \bigr) + P\bigl( X_n + Y_n \le x, \ |Y_n| > \varepsilon \bigr) \le P\bigl( X_n \le x + \varepsilon \bigr) + P\bigl( |Y_n| > \varepsilon \bigr). \]
Similarly,
\[ P\bigl( X_n \le x - \varepsilon \bigr) \le P\bigl( X_n + Y_n \le x \bigr) + P\bigl( |Y_n| > \varepsilon \bigr). \]
Hence,
\[ P\bigl( X_n \le x - \varepsilon \bigr) - P\bigl( |Y_n| > \varepsilon \bigr) \le P\bigl( X_n + Y_n \le x \bigr) \le P\bigl( X_n \le x + \varepsilon \bigr) + P\bigl( |Y_n| > \varepsilon \bigr). \]
Letting $n \to \infty$ and then $\varepsilon \downarrow 0$ proves (a).
(b) To prove this property, we use Proposition 3.2: we show that
\[ E\bigl[ f(X_n Y_n) \bigr] \to E\bigl[ f(aX) \bigr] \]
for every bounded, continuous function $f$. Let $\varepsilon > 0$ and $M = \sup_x |f(x)| < \infty$, and choose $K > 0$ such that $\pm K$ are continuity points of the cumulative distribution function $F_X$ and $P\bigl( |X| > K \bigr) < \frac{\varepsilon}{16M}$, which implies that $P\bigl( |X_n| > K \bigr) < \frac{\varepsilon}{8M}$ for all sufficiently large values of $n$. Since $f$ is uniformly continuous on compact intervals, one can find $\delta > 0$ such that $|f(x) - f(y)| < \frac{\varepsilon}{4}$ whenever $|x - y| \le \delta$; in particular, on the event $\bigl\{ |Y_n - a| \le \frac{\delta}{K}, \ |X_n| \le K \bigr\}$ we have $|X_n Y_n - a X_n| = |X_n| \, |Y_n - a| \le \delta$, so that $|f(X_n Y_n) - f(a X_n)| \le \frac{\varepsilon}{4}$. Also, the convergence in probability of the sequence $(Y_n)$ toward the constant $a$ allows us to choose $N_0 > 0$ such that
\[ P\Bigl( |Y_n - a| > \frac{\delta}{K} \Bigr) < \frac{\varepsilon}{8M} \quad \text{whenever } n \ge N_0. \]
We then have
\begin{align*}
\bigl| E\bigl[ f(X_n Y_n) \bigr] - E\bigl[ f(aX) \bigr] \bigr|
&\le E\Bigl[ \bigl| f(X_n Y_n) - f(a X_n) \bigr| ; \ |Y_n - a| > \tfrac{\delta}{K} \Bigr] \\
&\quad + E\Bigl[ \bigl| f(X_n Y_n) - f(a X_n) \bigr| ; \ \bigl\{ |Y_n - a| \le \tfrac{\delta}{K}, \ |X_n| > K \bigr\} \Bigr] \\
&\quad + E\Bigl[ \bigl| f(X_n Y_n) - f(a X_n) \bigr| ; \ \bigl\{ |Y_n - a| \le \tfrac{\delta}{K}, \ |X_n| \le K \bigr\} \Bigr] \\
&\quad + \bigl| E\bigl[ f(a X_n) \bigr] - E\bigl[ f(aX) \bigr] \bigr| \\
&\le 2M \, P\Bigl( |Y_n - a| > \tfrac{\delta}{K} \Bigr) + 2M \, P\bigl( |X_n| > K \bigr) + \frac{\varepsilon}{4} + \bigl| E\bigl[ f(a X_n) \bigr] - E\bigl[ f(aX) \bigr] \bigr|.
\end{align*}
On the other hand, we can show that the sequence $(a X_n)$ converges to $aX$ in distribution. Indeed, take the bounded and continuous function $h(x) = f(ax)$ and use the fact that the sequence $(X_n)$ converges to $X$ in distribution. It follows that
\[ E\bigl[ f(a X_n) \bigr] = E\bigl[ h(X_n) \bigr] \to E\bigl[ h(X) \bigr] = E\bigl[ f(aX) \bigr]. \]
The latter allows us to choose $N_1 > 0$ such that
\[ \bigl| E\bigl[ f(a X_n) \bigr] - E\bigl[ f(aX) \bigr] \bigr| < \frac{\varepsilon}{4} \quad \text{whenever } n \ge N_1. \]
Now, take $N = \max(N_0, N_1)$. For any $n \ge N$, we obtain
\[ \bigl| E\bigl[ f(X_n Y_n) \bigr] - E\bigl[ f(aX) \bigr] \bigr| \le \frac{2\varepsilon}{8} + \frac{2\varepsilon}{8} + \frac{\varepsilon}{4} + \frac{\varepsilon}{4} = \varepsilon, \]
as desired. $\square$
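Slutsky's theorem can be observed numerically. In the sketch below (an illustration of our own; the constructions and constants are arbitrary choices), $X_n$ is a CLT-standardized sum of uniforms, so it converges in distribution to $N(0,1)$, while $Y_n = 2 + U/n$ converges in probability to $a = 2$. Then $X_n + Y_n$ should look like $N(2, 1)$ and $Y_n X_n$ like $2X$, whose second moment is $4$.

```python
import math
import random

random.seed(3)

def sample_pair(n):
    """One draw of (X_n, Y_n): X_n is a standardized sum of n uniforms
    (-> N(0,1) in distribution by the CLT), Y_n = 2 + U/n (-> 2 in probability)."""
    s = sum(random.random() for _ in range(n))
    x = (s - n / 2) / math.sqrt(n / 12)
    y = 2 + random.random() / n
    return x, y

n, trials = 50, 20_000
pairs = [sample_pair(n) for _ in range(trials)]
sums = [x + y for x, y in pairs]        # Slutsky (a): -> X + 2 in distribution
prods = [y * x for x, y in pairs]       # Slutsky (b): -> 2X in distribution
mean_sum = sum(sums) / trials                        # ~ E[X + 2] = 2
second_moment_prod = sum(p * p for p in prods) / trials   # ~ E[(2X)^2] = 4
```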
The following proposition, due to Skorohod, relates convergence in distribution and almost sure convergence.

Proposition 3.8. (Skorohod Representation Theorem) Let $(X_n)$ be a sequence of random variables, and assume that $(X_n)$ converges to $X$ in distribution as $n \to \infty$. Let $F_n$ be the cumulative distribution function of $X_n$ and let $F$ be the cumulative distribution function of $X$. Then, there exist a probability space $(\Omega', \mathcal{F}', P')$ and random variables $Y_n$ and $Y$, all defined on $(\Omega', \mathcal{F}', P')$, such that $Y$ has cumulative distribution function $F$, each $Y_n$ has cumulative distribution function $F_n$, and $(Y_n)$ converges to $Y$ almost surely as $n \to \infty$.

Proof. For a proof, see, e.g., Billingsley [29]. $\square$
3.1.3.4 $L^p$-Convergence

Definition 3.10. Let $p \ge 1$. The sequence $(X_n)$ converges in $L^p$ to the random variable $X$ if $E|X_n|^p < \infty$ for all $n$, $E|X|^p < \infty$, and
\[ \lim_{n \to \infty} E|X_n - X|^p = 0. \]

By Markov's inequality, $P\bigl( \omega : |X_n(\omega) - X(\omega)| > \varepsilon \bigr) \le \varepsilon^{-p} \, E|X_n - X|^p$ for any $\varepsilon > 0$. Thus, if $(X_n)$ converges in $L^p$ to $X$, then $(X_n)$ converges in probability to $X$. The converse is in general false.
Example 3.4. (Convergence in probability, not in $L^p$) Let $([0, 1], \mathcal{B}([0, 1]), P)$ be a probability space with $P(d\omega) = d\omega$, the uniform probability distribution on $[0, 1]$, and let $X_n = 2^n \mathbf{1}_{(0, \frac{1}{n})}$ be a sequence of random variables defined on this space. The sequence $(X_n)$ converges in probability to zero as $n \to \infty$ but does not converge in $L^p$, $p \ge 1$. To see this, fix $\varepsilon > 0$. Then we have
\[ P\bigl( |X_n| > \varepsilon \bigr) = P\Bigl( \bigl( 0, \tfrac{1}{n} \bigr) \Bigr) = \frac{1}{n} \to 0 \quad \text{as } n \to \infty. \]
Hence, $(X_n)$ converges in probability to zero. On the other hand,
\[ E|X_n|^p = 2^{np} \, P\Bigl( \bigl( 0, \tfrac{1}{n} \bigr) \Bigr) = \frac{2^{np}}{n} \to \infty \quad \text{as } n \to \infty. \]
Hence, convergence in probability does not imply $L^p$-convergence.
Example 3.5. (Convergence in $L^p$, not almost surely) Let the sample space $\Omega$ be the closed unit interval $[0, 1]$ with Lebesgue measure $\lambda$. Define a sequence $(X_n)$ of random variables as follows: $X_1 = \mathbf{1}_{[0, \frac{1}{2}]}$, $X_2 = \mathbf{1}_{[\frac{1}{2}, 1]}$, $X_3 = \mathbf{1}_{[0, \frac{1}{3}]}$, $X_4 = \mathbf{1}_{[\frac{1}{3}, \frac{2}{3}]}$, $X_5 = \mathbf{1}_{[\frac{2}{3}, 1]}$, and so on. We claim that this sequence converges to zero in $L^p$ but does not converge to zero almost surely. To see this, let us compute the $p$th moments of the first few random variables and observe the pattern. We have
\[ E|X_1|^p = E|X_2|^p = \frac{1}{2}, \qquad E|X_3|^p = E|X_4|^p = E|X_5|^p = \frac{1}{3}, \ \ldots \]
so that $E|X_n|^p \to 0$ as $n \to \infty$. In addition, by Markov's inequality, the latter implies that $(X_n)$ converges in probability to $0$. However, $(X_n)$ does not converge to $0$ almost surely. Indeed, there is no value of $\omega \in [0, 1]$ for which $X_n(\omega) \to 0$: for every $\omega \in [0, 1]$, the value of $X_n(\omega)$ alternates between $0$ and $1$ infinitely often. No pointwise convergence occurs for this sequence.
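The "typewriter" pattern of Example 3.5 can be coded directly (our own sketch; the block-indexing helper `interval` is an assumption about how the sequence continues: block $k$ consists of the $k$ indicators of the intervals $[\frac{j}{k}, \frac{j+1}{k}]$, $j = 0, \ldots, k-1$). The $p$th moments shrink like $1/k$, yet any fixed $\omega$ is hit by one indicator in every block, so $X_n(\omega)$ returns to $1$ infinitely often.

```python
def interval(n):
    """Interval [j/k, (j+1)/k] carried by X_n: block k (k = 2, 3, ...) holds
    k consecutive indicator variables sweeping across [0, 1]."""
    k, idx = 2, n - 1
    while idx >= k:
        idx -= k
        k += 1
    return idx / k, (idx + 1) / k

def pth_moment(n):
    a, b = interval(n)
    return b - a          # E|X_n|^p = lambda([a, b]) = 1/k for any p >= 1

def X(n, omega):
    a, b = interval(n)
    return 1.0 if a <= omega <= b else 0.0

m_small = pth_moment(5)       # X_5 = 1_{[2/3, 1]}, moment 1/3
m_big = pth_moment(1000)      # deep in the sequence, moment is tiny
# but for any fixed omega, X_n(omega) = 1 in every block, i.e. infinitely often:
hits = [n for n in range(1, 200) if X(n, 0.25) == 1.0]
```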
Remark 3.3. For proofs of the various results discussed in this subsection, we refer the reader to, for instance, Bauer [17], Casella and Berger [34], or Métivier [140].
3.1.4 Conditional Expectation

Let $X$ be an integrable random variable defined on a probability space $(\Omega, \mathcal{F}, P)$, and let $\mathcal{G}$ denote a sub-$\sigma$-field of $\mathcal{F}$.

Definition 3.11. The conditional expectation $E\bigl[ X \mid \mathcal{G} \bigr]$ of $X$ with respect to $\mathcal{G}$ is defined to be the class of $\mathcal{G}$-measurable functions satisfying
\[ \int_A X \, dP = \int_A E\bigl[ X \mid \mathcal{G} \bigr] \, dP, \quad A \in \mathcal{G}. \tag{3.4} \]

It is important to note that the random variable $E\bigl[ X \mid \mathcal{G} \bigr]$ can be understood as an updated version of the expectation of $X$, given the information $\mathcal{G}$.

We list here some properties of the conditional expectation $E\bigl[ X \mid \mathcal{G} \bigr]$ that are frequently used in calculations.

Proposition 3.9. (i) Expectation law: $E\bigl[ E[ X \mid \mathcal{G} ] \bigr] = E[X]$.
(ii) If $X$ is $\mathcal{G}$-measurable, then $E\bigl[ X \mid \mathcal{G} \bigr] = X$ a.s.
(iii) Stability: If $Y$ is $\mathcal{G}$-measurable and bounded, then $E\bigl[ XY \mid \mathcal{G} \bigr] = Y E\bigl[ X \mid \mathcal{G} \bigr]$ a.s.
(iv) Independence law: If $X$ and the $\sigma$-field $\mathcal{G}$ are independent, then $E\bigl[ X \mid \mathcal{G} \bigr] = E[X]$.
(v) $E\bigl[ \bigl( X - E[ X \mid \mathcal{G} ] \bigr) Y \bigr] = 0$ for every bounded $\mathcal{G}$-measurable $Y$.
(vi) The conditional expectation $E\bigl[ X \mid \mathcal{G} \bigr]$ is the projection of $X$ on $\mathcal{G}$, and $X - E\bigl[ X \mid \mathcal{G} \bigr]$ is orthogonal to $\mathcal{G}$. In other words, $E\bigl[ X \mid \mathcal{G} \bigr]$ is the $\mathcal{G}$-measurable random variable that is closest to $X$ in the mean square sense.
Proof. (i) This property follows immediately from (3.4) with $A = \Omega$.
(ii) Since $X$ is itself $\mathcal{G}$-measurable and trivially satisfies (3.4), it is a version of $E\bigl[ X \mid \mathcal{G} \bigr]$.
(iii) To prove this property, we show that $Y E\bigl[ X \mid \mathcal{G} \bigr]$ is a version of $E\bigl[ XY \mid \mathcal{G} \bigr]$. The $\mathcal{G}$-measurability of $Y E\bigl[ X \mid \mathcal{G} \bigr]$ follows from that of $Y$ and $E\bigl[ X \mid \mathcal{G} \bigr]$. It remains to show that
\[ \int_A Y E\bigl[ X \mid \mathcal{G} \bigr] \, dP = \int_A YX \, dP, \quad A \in \mathcal{G}. \tag{3.5} \]
If $Y = \mathbf{1}_G$, $G \in \mathcal{G}$, then for any $H \in \mathcal{G}$, one has
\[ \int_H E\bigl[ X \mathbf{1}_G \mid \mathcal{G} \bigr] \, dP = \int_H X \mathbf{1}_G \, dP = \int_{H \cap G} X \, dP \]
and
\[ \int_H \mathbf{1}_G E\bigl[ X \mid \mathcal{G} \bigr] \, dP = \int_{G \cap H} E\bigl[ X \mid \mathcal{G} \bigr] \, dP = \int_{G \cap H} X \, dP. \]
Hence, (3.5) holds in this case. One can also show that (3.5) holds for a simple random variable $Y = \sum_n \alpha_n \mathbf{1}_{A_n}$, $A_n \in \mathcal{G}$, by linearity of the expectation. The extension to an arbitrary bounded random variable follows from the representation of $Y$ as a difference of two positive random variables, each of which can be defined as a limit of simple random variables.
(iv) Let $A \in \mathcal{G}$. Then, by independence,
\[ \int_A X \, dP = \int_{\Omega} \mathbf{1}_A X \, dP = E\bigl[ \mathbf{1}_A X \bigr] = E[\mathbf{1}_A] \, E[X] = \int_A E[X] \, dP, \]
from which the property follows.
(v) The proof of this property is similar to that of property (iii). It is left to the reader as an exercise.
(vi) This property is a straight consequence of property (v). $\square$
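When $\mathcal{G}$ is generated by a finite partition, $E[X \mid \mathcal{G}]$ is simply the block average of $X$ over each atom, which makes Proposition 3.9 easy to check by hand. The sketch below (our own concrete example with a uniform eight-point $\Omega$ and an arbitrary $X$) verifies the expectation law (i) and the orthogonality property (v).

```python
# Omega = {0,...,7} with uniform P; G is generated by the partition
# {0,1,2,3}, {4,5}, {6,7}. On each atom, E[X|G] equals the average of X there
# (formula (3.4) specialized to the atoms of the partition).
omega = list(range(8))
blocks = [[0, 1, 2, 3], [4, 5], [6, 7]]
X = {0: 1.0, 1: 3.0, 2: 5.0, 3: 7.0, 4: 2.0, 5: 4.0, 6: 10.0, 7: 0.0}

cond = {}
for block in blocks:
    avg = sum(X[w] for w in block) / len(block)
    for w in block:
        cond[w] = avg                     # the value of E[X|G] on this atom

E_X = sum(X[w] for w in omega) / len(omega)
E_cond = sum(cond[w] for w in omega) / len(omega)   # expectation law: = E[X]

# Orthogonality (Proposition 3.9(v)) with the G-measurable Y = 1 on block 0:
Y = {w: (1.0 if w in blocks[0] else 0.0) for w in omega}
inner = sum((X[w] - cond[w]) * Y[w] for w in omega) / len(omega)
```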
We now collect essential properties of the conditional expectation that are similar to the properties of the expectation operator.

Proposition 3.10. If $X$ and $X_n$ are integrable random variables, then
(i) Linearity: $E\bigl[ a X_1 + b X_2 \mid \mathcal{G} \bigr] = a E\bigl[ X_1 \mid \mathcal{G} \bigr] + b E\bigl[ X_2 \mid \mathcal{G} \bigr]$ a.s.
(ii) Positivity: $X \ge 0$ implies $E\bigl[ X \mid \mathcal{G} \bigr] \ge 0$ a.s.
(iii) Monotonicity: $X_1 \le X_2$ implies $E\bigl[ X_1 \mid \mathcal{G} \bigr] \le E\bigl[ X_2 \mid \mathcal{G} \bigr]$ a.s.
(iv) Monotone convergence: if $X_n \uparrow X$ a.s., then $E\bigl[ X_n \mid \mathcal{G} \bigr] \uparrow E\bigl[ X \mid \mathcal{G} \bigr]$ a.s.
(v) Dominated convergence: $|X_n| \le Y$, $E[Y] < \infty$, and $X_n \to X$ a.s. imply $E\bigl[ X_n \mid \mathcal{G} \bigr] \to E\bigl[ X \mid \mathcal{G} \bigr]$ a.s.
(vi) Cauchy–Schwarz inequality: $\bigl( E\bigl[ XY \mid \mathcal{G} \bigr] \bigr)^2 \le E\bigl[ X^2 \mid \mathcal{G} \bigr] \, E\bigl[ Y^2 \mid \mathcal{G} \bigr]$ a.s.
(vii) Jensen inequality: $\varphi\bigl( E\bigl[ X \mid \mathcal{G} \bigr] \bigr) \le E\bigl[ \varphi(X) \mid \mathcal{G} \bigr]$ a.s. for a convex function $\varphi$.
(viii) Modulus inequality: $\bigl| E\bigl[ X \mid \mathcal{G} \bigr] \bigr| \le E\bigl[ |X| \mid \mathcal{G} \bigr]$ a.s.
Proof. (i) This property follows immediately from the linearity of the integral.
(ii) To prove this, let $X \ge 0$. Then, for every $A \in \mathcal{G}$, we have
\[ \int_A E\bigl[ X \mid \mathcal{G} \bigr] \, dP = \int_A X \, dP \ge 0, \]
so that $E\bigl[ X \mid \mathcal{G} \bigr] \ge 0$ a.s.
(iii) This property follows immediately from (i) and (ii).
(iv) By monotonicity, there is a $\mathcal{G}$-measurable random variable $Z$ such that $E\bigl[ X_n \mid \mathcal{G} \bigr] \uparrow Z$. Let $A \in \mathcal{G}$. Using the Lebesgue Monotone Convergence Theorem, one has
\[ E\bigl[ Z \mathbf{1}_A \bigr] = \lim_{n \to \infty} E\bigl[ E[ X_n \mid \mathcal{G} ] \mathbf{1}_A \bigr] = \lim_{n \to \infty} E\bigl[ X_n \mathbf{1}_A \bigr] = E\bigl[ X \mathbf{1}_A \bigr], \]
which proves that $Z = E\bigl[ X \mid \mathcal{G} \bigr]$.
(v) To prove this property, let $Y_n = \sup_{k \ge n} |X_k - X|$, $n \ge 1$. Then $Y_n \ge 0$, $Y_n \downarrow$ as $n \to \infty$, and $Y_n \le 2Y$ almost surely for $n \ge 1$, so that $Y_n$ is integrable for each $n \ge 1$. Also, since $X_n$ converges to $X$ almost surely, $Y_n$ converges to $0$ almost surely. On the other hand, we have
\[ \bigl| E\bigl[ X_n \mid \mathcal{G} \bigr] - E\bigl[ X \mid \mathcal{G} \bigr] \bigr| \le E\bigl[ |X_n - X| \mid \mathcal{G} \bigr] \le E\bigl[ Y_n \mid \mathcal{G} \bigr] \quad \text{a.s.} \]
Thus, it is sufficient to show that $\lim_{n} E\bigl[ Y_n \mid \mathcal{G} \bigr] = 0$ a.s. From the facts that $Y_n \ge 0$ and $Y_n \downarrow$, it follows that $E\bigl[ Y_n \mid \mathcal{G} \bigr] \ge 0$ and $E\bigl[ Y_n \mid \mathcal{G} \bigr] \downarrow$, and hence $V = \lim_{n} E\bigl[ Y_n \mid \mathcal{G} \bigr]$ exists and $E[V] \le E\bigl[ E[ Y_n \mid \mathcal{G} ] \bigr] = E[Y_n]$. But $\lim_n Y_n = 0$ a.s. and $Y_n \le 2Y$, so that by the Dominated Convergence Theorem $\lim_n E[Y_n] = 0$. Hence $E[V] = 0$, and since $V \ge 0$, $V = 0$ a.s.
(vi) Define the random variables
\[ U = \bigl( E\bigl[ |X|^2 \mid \mathcal{G} \bigr] \bigr)^{1/2}, \qquad V = \bigl( E\bigl[ |Y|^2 \mid \mathcal{G} \bigr] \bigr)^{1/2}, \]
and note that they are $\mathcal{G}$-measurable. Observe
\[ E\bigl[ |X|^2 \mathbf{1}_{\{U = 0\}} \bigr] = E\bigl[ \mathbf{1}_{\{U = 0\}} E\bigl[ |X|^2 \mid \mathcal{G} \bigr] \bigr] = E\bigl[ \mathbf{1}_{\{U = 0\}} U^2 \bigr] = 0. \]
Thus, $|X| \mathbf{1}_{\{U = 0\}} = 0$ a.s., which implies that
\[ E\bigl[ |XY| \mid \mathcal{G} \bigr] \mathbf{1}_{\{U = 0\}} = E\bigl[ |XY| \mathbf{1}_{\{U = 0\}} \mid \mathcal{G} \bigr] = 0. \]
Similarly, we can also show that $E\bigl[ |XY| \mid \mathcal{G} \bigr] \mathbf{1}_{\{V = 0\}} = 0$. Therefore, the conditional Cauchy–Schwarz inequality holds on the set $\{U = 0\} \cup \{V = 0\}$. On the set $\bigl\{ U = \infty, V > 0 \bigr\} \cup \bigl\{ U > 0, V = \infty \bigr\}$, the right-hand side is infinite, and the conditional Cauchy–Schwarz inequality holds too. Dividing by the right-hand side, it is then enough to show that
\[ \frac{E\bigl[ |XY| \mid \mathcal{G} \bigr]}{UV} \mathbf{1}_H \le 1 \quad \text{a.s. on the set } H := \bigl\{ 0 < U < \infty, \ 0 < V < \infty \bigr\}. \]
To prove this, let $G \in \mathcal{G}$, $G \subset H$. Using the measurability of $U$, $V$, and $\mathbf{1}_G$ with respect to $\mathcal{G}$, the properties of the conditional expectation, and the classical Cauchy–Schwarz inequality, we have
\begin{align*}
E\Bigl[ \frac{E\bigl[ |XY| \mid \mathcal{G} \bigr]}{UV} \mathbf{1}_G \Bigr]
&= E\Bigl[ E\Bigl[ \frac{|XY|}{UV} \mathbf{1}_G \ \Big| \ \mathcal{G} \Bigr] \Bigr] = E\Bigl[ \frac{|X|}{U} \mathbf{1}_G \cdot \frac{|Y|}{V} \mathbf{1}_G \Bigr] \\
&\le \Bigl( E\Bigl[ \frac{|X|^2}{U^2} \mathbf{1}_G \Bigr] \Bigr)^{1/2} \Bigl( E\Bigl[ \frac{|Y|^2}{V^2} \mathbf{1}_G \Bigr] \Bigr)^{1/2} \\
&= \Bigl( E\Bigl[ \frac{E\bigl[ |X|^2 \mid \mathcal{G} \bigr]}{U^2} \mathbf{1}_G \Bigr] \Bigr)^{1/2} \Bigl( E\Bigl[ \frac{E\bigl[ |Y|^2 \mid \mathcal{G} \bigr]}{V^2} \mathbf{1}_G \Bigr] \Bigr)^{1/2} \\
&= \bigl( E[\mathbf{1}_G] \bigr)^{1/2} \bigl( E[\mathbf{1}_G] \bigr)^{1/2} = E[\mathbf{1}_G],
\end{align*}
which implies the desired result.
(vii) To prove this property, we use a classical characterization of a convex function, namely, every convex function $\varphi$ is the upper envelope of a countable collection of affine lines: $\varphi(x) = \sup_n (a_n x + b_n)$. Define $L_n(x) = a_n x + b_n$ for all $x$. We then have
\[ L_n\bigl( E\bigl[ X \mid \mathcal{G} \bigr] \bigr) = E\bigl[ L_n(X) \mid \mathcal{G} \bigr] \le E\bigl[ \varphi(X) \mid \mathcal{G} \bigr] \]
and thus
\[ \varphi\bigl( E\bigl[ X \mid \mathcal{G} \bigr] \bigr) = \sup_n L_n\bigl( E\bigl[ X \mid \mathcal{G} \bigr] \bigr) \le E\bigl[ \varphi(X) \mid \mathcal{G} \bigr]. \]
(viii) The proof of this property is left to the reader as an exercise. $\square$
3.2 Stochastic Processes

In recent years there has been an ever-increasing interest in the study of systems which evolve in time in a random manner. Mathematical models of such systems are known as stochastic processes.

More precisely, let $X$ be the random variable of interest depending on a parameter $t$, which assumes values from a set $T \subset [0, \infty)$. In many applications, the parameter $t$ is considered to be time. Thus, $X(t)$ is the state of the random variable $X$ at time $t$, whereas $T$ denotes the time set. Furthermore, let $S$ denote the set of all states (realizations) which the $X(t)$, $t \in T$, can assume.

Definition 3.12. A stochastic process $X$ with parameter set $T$ and state space $\mathbb{R}$ is a collection of $\mathbb{R}$-valued random variables
\[ \bigl\{ X(t), \ t \in T \bigr\} = \bigl\{ X(\omega, t), \ \omega \in \Omega, \ t \in T \bigr\} \]
defined on the probability space $(\Omega, \mathcal{F}, P)$.
Note that for each fixed $t \in T$ we have a random variable
\[ \omega \mapsto X(\omega, t), \quad \omega \in \Omega. \]
On the other hand, for each fixed $\omega \in \Omega$ we have a function
\[ t \mapsto X(\omega, t), \quad t \in T, \]
which is called a sample path of the process. The stochastic process $X$ may be regarded as a function of two variables $(\omega, t)$ from $\Omega \times T$ to $\mathbb{R}$.

If $T$ is a finite or countably infinite set, then $\bigl\{ X(t), t \in T \bigr\}$ is called a discrete-time stochastic process. Such processes can be written as a sequence of random variables $(X_t)$. Conversely, every sequence of random variables can be interpreted as a discrete-time stochastic process. If $T$ is an interval, then $\bigl\{ X(t), t \in T \bigr\}$ is a continuous-time stochastic process. The stochastic process $\bigl\{ X(t), t \in T \bigr\}$ is said to be discrete if its state space $S$ is a finite or countably infinite set. It is said to be continuous if $S$ is an interval.

This section introduces three types of stochastic processes, Brownian motion, Gaussian processes, and martingales, that play a central role in the theory of stochastic processes.
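The two views of a process (fix $\omega$ to get a sample path; fix $t$ to get a random variable) can be made concrete with a simple random walk, the prototypical discrete-time process. This is an illustrative sketch of our own, not an example from the text.

```python
import random

random.seed(6)

def sample_path(steps):
    """One sample path of a simple +-1 random walk: fixing omega (the realized
    coin flips) gives the function t -> X(omega, t)."""
    x, path = 0, [0]
    for _ in range(steps):
        x += random.choice((-1, 1))
        path.append(x)
    return path

paths = [sample_path(100) for _ in range(1000)]
# Fixing t = 100 instead gives the random variable X(., 100),
# with mean 0, variance 100, and the parity of t.
vals_at_100 = [p[100] for p in paths]
mean_t = sum(vals_at_100) / len(vals_at_100)
var_t = sum(v * v for v in vals_at_100) / len(vals_at_100)
```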
3.2.1 Continuity

In this subsection, we give some of the most common definitions of continuity for stochastic processes. Let $\bigl\{ X(t), t \in T \bigr\}$ be an $\mathbb{R}$-valued stochastic process on a complete probability space $(\Omega, \mathcal{F}, P)$.

Definition 3.13. (i) $X$ is continuous in probability at $t \in T$ if for any $\varepsilon > 0$,
\[ \lim_{s \to t} P\bigl( |X(\omega, s) - X(\omega, t)| > \varepsilon \bigr) = 0. \]
(ii) $X$ is continuous in the $p$-th mean at $t \in T$ if
\[ \lim_{s \to t} E\bigl[ |X(s) - X(t)|^p \bigr] = 0. \tag{3.6} \]
(iii) $X$ is almost surely (a.s.) continuous at $t \in T$ if
\[ P\bigl( \omega : \lim_{s \to t} |X(\omega, s) - X(\omega, t)| = 0 \bigr) = 1. \tag{3.7} \]

Remark 3.4. (i) In Definition 3.13(iii), Eq. (3.7) is equivalent to
\[ P\bigl( \omega : \lim_{s \to t} X(\omega, s) \neq X(\omega, t) \bigr) = 0. \]
(ii) If $p = 2$ in Eq. (3.6), $X$ is said to be continuous in the mean-square sense at $t$. The $p$-th mean continuity is used extensively in the following chapters.

The stochastic process $X$ is continuous in probability, continuous in the $p$-th mean, and almost surely continuous in an interval $I \subset T$ if it is continuous in probability, continuous in the $p$-th mean, and almost surely continuous at each $t \in I$, respectively.
Definition 3.14. Two stochastic processes $X$ and $Y$ with a common index set $T \subset \mathbb{R}_+$ are called versions of one another if for all $t \in T$,
\[ P\bigl( \omega : X(\omega, t) = Y(\omega, t) \bigr) = 1. \]
Such processes are also said to be stochastically equivalent.

Proposition 3.11. If $X$ and $Y$ are versions of one another, they have the same finite-dimensional distributions.

Proof. Let $I$ be an arbitrary finite collection of indices. It suffices to show that $P\bigl( \omega : X_I(\omega) = Y_I(\omega) \bigr) = 1$. For this purpose, let $I = \bigl\{ t_j, \ 1 \le j \le i \bigr\}$. Using the subadditivity of $P$, we have
\begin{align*}
P\bigl( \omega : X_I(\omega) = Y_I(\omega) \bigr) &= P\bigl( \omega : X(\omega, t_1) = Y(\omega, t_1), \ldots, X(\omega, t_i) = Y(\omega, t_i) \bigr) \\
&= 1 - P\Bigl( \bigcup_{j=1}^{i} \bigl\{ \omega : X(\omega, t_j) \neq Y(\omega, t_j) \bigr\} \Bigr) \\
&\ge 1 - \sum_{j=1}^{i} P\bigl( X(\omega, t_j) \neq Y(\omega, t_j) \bigr) = 1. \qquad \square
\end{align*}
There is a stronger notion of similarity between processes than that of versions, which is sometimes useful in applications.

Definition 3.15. Two stochastic processes $X$ and $Y$ are indistinguishable if their sample paths coincide almost surely, that is,
\[ P\bigl( \omega : \forall t \in T, \ X(\omega, t) = Y(\omega, t) \bigr) = 1. \]

In the following example, we describe stochastic processes that are versions of one another, but not indistinguishable.

Example 3.6. Let $X = \bigl\{ X(t), 0 \le t \le 1 \bigr\}$ and $Y = \bigl\{ Y(t), 0 \le t \le 1 \bigr\}$ be real-valued stochastic processes defined on the probability space $\bigl( [0, 1], \mathcal{B}([0, 1]), \lambda \bigr)$, where $\lambda$ is the Lebesgue measure on $[0, 1]$, such that $X(\omega, t) = 0$ and
\[ Y(\omega, t) = \begin{cases} 1 & \text{if } \omega = t, \\ 0 & \text{if } \omega \neq t. \end{cases} \]
Note that for each $\omega \in [0, 1]$ fixed,
\[ \sup_{0 \le t \le 1} X(\omega, t) = 0 \quad \text{while} \quad \sup_{0 \le t \le 1} Y(\omega, t) = 1. \]
It follows that the sample paths of $X$ and $Y$ differ for every $\omega \in [0, 1]$. Therefore, they are not indistinguishable. On the other hand, for each $t \in [0, 1]$ fixed, let
\[ \Omega_t = \bigl\{ \omega : X(\omega, t) \neq Y(\omega, t) \bigr\} = \bigl\{ t \bigr\}. \]
Then, we have $\lambda\bigl( \{ t \} \bigr) = 0$, which means that the processes $X$ and $Y$ are versions of one another.
We now state a famous theorem of Kolmogorov.

Theorem 3.2. Suppose that the process $X = \bigl\{ X(t), t \in T \bigr\}$ satisfies the following condition: for all $T_0 > 0$ there exist positive constants $\alpha$, $\beta$, and $C$ such that
\[ E\bigl[ |X(t) - X(s)|^{\alpha} \bigr] \le C \, |t - s|^{1 + \beta} \quad \text{for } 0 \le s, t \le T_0. \]
Then there exists a continuous version of $X$.

For a proof, see, e.g., Strook and Varadhan [168] or Bakstein and Capasso [15].
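A standard application of Theorem 3.2 (anticipating the Brownian motion of the next sections, and stated here only as an illustration of our own): a Brownian increment $W_t - W_s$ is $N(0, |t-s|)$, so $E|W_t - W_s|^4 = 3|t - s|^2$, i.e. the Kolmogorov condition holds with $\alpha = 4$, $\beta = 1$, $C = 3$, yielding a continuous version. The sketch estimates that fourth moment by Monte Carlo.

```python
import random

random.seed(5)

def fourth_moment(dt, trials=200_000):
    """Monte Carlo estimate of E|W_{s+dt} - W_s|^4 for a Brownian increment,
    which is N(0, dt); the exact value is 3 * dt**2."""
    return sum(random.gauss(0.0, dt ** 0.5) ** 4 for _ in range(trials)) / trials

dt = 0.25
est = fourth_moment(dt)        # should be close to 3 * 0.25**2 = 0.1875
```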
3.2.2 Separability and Measurability

Let $X = \bigl\{ X(t), 0 \le t \le 1 \bigr\}$ be a stochastic process. In general, $\sup_{0 \le t \le 1} X(t)$ does not define a random variable. For instance, take $\Omega = [0, 1]$, $\mathcal{F} = \mathcal{B}([0, 1])$, and $P = \lambda$, the Lebesgue measure on $[0, 1]$. Let $A \subset [0, 1]$ be a nonmeasurable set and define a stochastic process by
\[ X(\omega, t) = \begin{cases} 1 & \text{if } t \in A \text{ and } \omega = t, \\ 0 & \text{otherwise.} \end{cases} \]
Then the function $\sup_{0 \le t \le 1} X(\omega, t)$ is given by
\[ \sup_{0 \le t \le 1} X(\omega, t) = \begin{cases} 1 & \text{if } \omega \in A, \\ 0 & \text{if } \omega \in A^c. \end{cases} \]
Clearly, $\sup_{0 \le t \le 1} X(\omega, t)$ is not measurable. Hence, it does not define a random variable. In order to overcome this difficulty involving supremum and infimum, we impose the condition of separability of stochastic processes.

Definition 3.16. The process $X = \bigl\{ X(t), t \in T \bigr\}$ is said to be separable if there is a countable dense subset $S$ of $T$, called the separating set, and a set $\Omega_0$ with $P(\Omega_0) = 0$, called the negligible set, such that if $\omega \in \Omega_0^c$ and $t \in T$, there is a sequence $s_n \in S$, $s_n \to t$, with $X(\omega, s_n) \to X(\omega, t)$.

The following proposition is well known.

Proposition 3.12. Every real stochastic process $X = \bigl\{ X(t), t \in T \bigr\}$ possesses a separable version. Moreover, if a separable stochastic process $X$ is continuous in probability, then any countable dense subset of $T$ is a separating set.

Proof. See, e.g., Ash and Gardner [12]. $\square$

Remark 3.5. By virtue of Proposition 3.12, we may therefore only consider separable stochastic processes.
Example 3.7. (Nonseparable stochastic process) Consider a probability space $(\Omega, \mathcal{F}, P)$ on which is defined a positive random variable $Z$ with continuous distribution, so that $P(Z = x) = 0$ for each $x$. For $t \ge 0$, put $X(\omega, t) = 0$ for all $\omega$, and put
\[ Y(\omega, t) = \begin{cases} 1 & \text{if } Z(\omega) = t, \\ 0 & \text{if } Z(\omega) \neq t. \end{cases} \]
Since $Z$ has a continuous distribution, $P\bigl( \omega : X(\omega, t) \neq Y(\omega, t) \bigr) = P\bigl( \omega : Z(\omega) = t \bigr) = 0$ for each $t$, and so $X$ and $Y$ are versions of one another. However, the stochastic process $Y$ is not separable unless the separating set $S$ contains the point $Z(\omega)$. The set of $\omega$ for which $Y(\omega, \cdot)$ is separable with respect to $S$ is thus contained in $\bigl\{ \omega : Z(\omega) \in S \bigr\}$, a set of probability zero, since $S$ is countable and $Z$ has a continuous distribution.
Definition 3.17. A filtration is a family $(\mathcal{F}_t)_{t \ge 0}$ of increasing sub-$\sigma$-fields of $\mathcal{F}$ (i.e., $\mathcal{F}_t \subset \mathcal{F}_s \subset \mathcal{F}$ for all $0 \le t < s < \infty$). The filtration is said to be right continuous if $\mathcal{F}_t = \bigcap_{s > t} \mathcal{F}_s$ for all $t \ge 0$. When the probability space is complete, the filtration is said to satisfy the usual conditions if it is right continuous and $\mathcal{F}_0$ contains all $P$-null sets.

From now on, unless otherwise specified, we shall always be working on a filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, P)$, where the filtration $(\mathcal{F}_t)_{t \ge 0}$ satisfies the usual conditions.

Let $X = \bigl\{ X(t), t \in [0, \infty) \bigr\}$ be an $\mathbb{R}$-valued stochastic process.
Definition 3.18. $X$ is said to be adapted if for every $t$, $X(t)$ is $\mathcal F_t$-measurable. It is said to be measurable if the stochastic process, regarded as a function of the two variables $(\omega, t)$, from $\Omega \times [0, \infty)$ to $\mathbb R$, is $\mathcal F \otimes \mathcal B([0, \infty))$-measurable, where $\mathcal B([0, \infty))$ is the family of all Borel subsets of $[0, \infty)$.

Definition 3.19. Let $X = \{X(t),\ t \in [0, \infty)\}$ be a stochastic process. The natural filtration $\mathcal F^X_t = \sigma(X(s),\ 0 \le s \le t)$ of $X$ is the smallest filtration with respect to which $X$ is adapted.
Example 3.8. Let $([0,1], \mathcal B([0,1]), \lambda)$ be the probability space defined in Example 3.1 and let $X(\omega) = \omega$ be a random variable defined on this space. Now, consider the stochastic process $Y : [0,1] \times [0, \infty) \to \mathbb R$ defined by $Y(\omega, t) = X(\omega)$. Clearly, the filtration $\mathcal F^Y_t$ of $Y$ is
$$\mathcal F^Y_t = \sigma\Big(\bigcup_{0 \le s \le t} \sigma(Y(s))\Big) = \sigma(X).$$
Because $X(\omega) = \omega$ is the identity random variable, the $\sigma$-field $\sigma(X)$ generated by the random variable $X$ is $\mathcal B([0,1])$. Thus, the natural filtration of $Y$ is $\mathcal F^Y_t = \mathcal B([0,1])$, $t \ge 0$.
Definition 3.20. The stochastic process $X$ is said to be progressively measurable or progressive if for every $T \ge 0$, $\{X(t),\ 0 \le t \le T\}$, regarded as a function of $(\omega, t)$ from $\Omega \times [0,T]$ to $\mathbb R$, is $\mathcal F_T \otimes \mathcal B([0,T])$-measurable, where $\mathcal B([0,T])$ is the family of all Borel subsets of $[0,T]$.

Proposition 3.13. If the process $(X_t)$ is progressively measurable, then it is also measurable.

Proof. Let $B \in \mathcal B(\mathbb R)$. Then
$$X^{-1}(B) = \{(\omega, s) \in \Omega \times \mathbb R_+ : X(\omega, s) \in B\} = \bigcup_{n=0}^{\infty} \{(\omega, s) \in \Omega \times [0,n] : X(\omega, s) \in B\}.$$
Since
$$\{(\omega, s) \in \Omega \times [0,n] : X(\omega, s) \in B\} \in \mathcal F_n \otimes \mathcal B([0,n])$$
for all $n \ge 0$, we have that $X^{-1}(B) \in \mathcal F \otimes \mathcal B(\mathbb R_+)$.
Before giving an example of a progressively measurable stochastic process, we need the following definition.

Definition 3.21. A stochastic process $X = \{X(\omega, t),\ t \ge 0\}$ is said to be
(1) right-continuous, if $P$-almost all of its paths $t \mapsto X(\omega, t)$ are right-continuous, i.e., if $X(\omega, t) = \lim_{s \downarrow t} X(\omega, s)$ for all $t \in \mathbb R_+$;
(2) left-continuous, if $P$-almost all of its paths $t \mapsto X(\omega, t)$ are left-continuous, i.e., if $X(\omega, t) = \lim_{s \uparrow t} X(\omega, s)$ for all $t \in \mathbb R_+$.

Example 3.9. Any right- or left-continuous adapted stochastic process $X = \{X(\omega, t),\ t \ge 0\}$ is progressively measurable.
To see this, let us assume that $X$ is a right-continuous stochastic process and define the sequence of stochastic processes
$$X_{n,t}(\omega, s) = \begin{cases} X(\omega, \tfrac{kt}{n}) & \text{if } \tfrac{(k-1)t}{n} < s \le \tfrac{kt}{n},\ k = 1, \dots, n, \\ X(\omega, 0) & \text{if } s = 0, \end{cases}$$
for $n = 1, 2, \dots$ and $t \ge 0$. Now, for any Borel set $F \subset \mathbb R$, write
$$\{(\omega, s) : \omega \in \Omega,\ 0 \le s \le t,\ X_{n,t}(\omega, s) \in F\} = \Big(\{\omega : X(\omega, 0) \in F\} \times \{0\}\Big) \cup \bigcup_{k=1}^{n} \Big(\{\omega : X(\omega, \tfrac{kt}{n}) \in F\} \times \Big(\tfrac{(k-1)t}{n}, \tfrac{kt}{n}\Big]\Big).$$
Clearly, this set belongs to $\mathcal F_t \otimes \mathcal B([0,t])$ since $X$ is adapted. Hence, for each $n \ge 1$ and $t \ge 0$, $X_{n,t}$ is $\mathcal F_t \otimes \mathcal B([0,t])$-measurable. By the right continuity of $X$ we have $X_{n,t}(\omega, s) \to X(\omega, s)$ as $n \to \infty$ for all $\omega \in \Omega$ and $0 \le s \le t$. Since limits of measurable functions are measurable, we conclude that $X$ is progressively measurable.
It is worth mentioning that the class of progressively measurable processes is too large for some purposes. Motivated by this remark, we define the so-called predictable processes.

Let $\mathcal L$ denote the family of all real-valued functions $Y(\omega, t)$ defined on $\Omega \times \mathbb R_+$ which are measurable with respect to $\mathcal F \otimes \mathcal B(\mathbb R_+)$ and have the following properties:

(i) $Y = (Y_t)$ is adapted to $(\mathcal F_t)$;
(ii) for each $\omega$, the function $t \mapsto Y(\omega, t)$ is left-continuous.

Now, let $\mathcal P$ be the smallest $\sigma$-field of subsets of $\Omega \times \mathbb R_+$ with respect to which all the functions belonging to $\mathcal L$ are measurable.

Definition 3.22. A stochastic process $X = (X_t)$ is predictable if the function $(\omega, t) \mapsto X(\omega, t)$ is $\mathcal P$-measurable. Predictable processes are sometimes called previsible.

Predictable processes are extensively used as integrands for stochastic integrals and are often not restricted to the adapted and left-continuous case.
Below are some simple examples of predictable processes.

Example 3.10. All $\mathcal F \otimes \mathcal B(\mathbb R_+)$-measurable, adapted, and left-continuous processes are predictable.

Example 3.11. A simple process $(\phi_t)$, which is defined to be of the form
$$\phi(\omega, t) = \phi_0(\omega) I_{\{0\}}(t) + \sum_{j=0}^{n-1} \phi_j(\omega) I_{(t_j, t_{j+1}]}(t), \qquad (0 = t_0 < t_1 < \dots < t_n), \tag{3.8}$$
is predictable if each $\phi_j$ is $\mathcal F_{t_j}$-measurable. The process given by (3.8) is adapted and left-continuous.
Example 3.12. Let $\eta_t$ be an adapted, right-continuous step process given by
$$\eta(\omega, t) = \sum_{j=0}^{n} \eta(\omega, t_j)\, 1_{[t_j, t_{j+1})}(t).$$
Let $(\phi_t)$ be the process defined by $\phi(\omega, t) = \eta(\omega, t^-)$, the left limit of $\eta(\omega, \cdot)$. Then $(\phi_t)$ is predictable.
The $\sigma$-field $\mathcal P$ has another characterization.

Proposition 3.14. The $\sigma$-field $\mathcal P$ is generated by all sets of the form
$$A \times (s, t], \quad 0 \le s < t < \infty,\ A \in \mathcal F_s, \qquad \text{or} \qquad A \times \{0\}, \quad A \in \mathcal F_0.$$
Proof. We follow the proof given in Kallianpur [107]. Denote by $\mathcal U$ the class of all functions of $(\omega, t)$ of the form $I_B(\omega) I_{(u,v]}(t)$, where $B \in \mathcal F_u$ and $u, v \in \mathbb R_+$ ($u \le v$), or of the form $I_B(\omega) I_{\{0\}}(t)$, where $B \in \mathcal F_0$. Clearly, each member of $\mathcal U$ is $\mathcal P$-measurable, so that $\sigma(\mathcal U) \subset \mathcal P$, where $\sigma(\mathcal U)$ is the smallest $\sigma$-field with respect to which all functions in $\mathcal U$ are measurable. To prove the converse inclusion, let $\phi \in \mathcal L$. Then, for each $(\omega, t)$, $\phi$ is the limit of a sequence of step processes of the form given in Example 3.11. Such a sequence is given by $(\phi_n)$, where
$$\phi_n(\omega, t) = \phi_0(\omega) I_{\{0\}}(t) + \sum_{j=0}^{j_n - 1} \phi(\omega, t^n_j)\, I_{(t^n_j, t^n_{j+1}]}(t)$$
and $0 = t^n_0 < t^n_1 < \dots < t^n_{j_n}$ is a subdivision of $[0, n]$ such that the length of each subinterval is less than or equal to $\frac1n$. Since $\phi(\omega, t^n_j)$ is $\mathcal F_{t^n_j}$-measurable, it is the pointwise limit of a sequence of step functions of the form $\sum_i a_i I_{B_i}(\omega)$, where $B_i \in \mathcal F_{t^n_j}$. Hence, $\phi_n(\omega, t)$ is measurable with respect to $\sigma(\mathcal U)$, which implies the measurability of $\phi(\omega, t)$ with respect to $\sigma(\mathcal U)$. This shows that $\mathcal P \subset \sigma(\mathcal U)$ and completes the proof.
Remark 3.6. The $\sigma$-field $\mathcal P$ given in Proposition 3.14 is called the predictable $\sigma$-field and its elements are called predictable sets. It plays an essential role in the construction of stochastic integrals.

The following result gives the relation between predictable and progressively measurable processes.

Proposition 3.15. Every predictable stochastic process is progressively measurable.

Proof. For a proof, see, e.g., Meyer [141].
3.2.3 Stopping Times

In what follows we are given a filtered probability space $(\Omega, \mathcal F, \mathcal F_t, P)$. We are often interested in events that occur at a random time. A random time is simply a $[0, \infty]$-valued random variable on the probability space. A very special class of random times is the so-called class of stopping times. More precisely, we have the following definition.

Definition 3.23. A random time $\tau$ is a stopping time for the filtration $(\mathcal F_t)_{t \ge 0}$ if $\{\tau \le t\} \in \mathcal F_t$ for every $t \ge 0$. The stopping time $\tau$ is said to be finite if $P(\tau = \infty) = 0$.

Suppose $\tau$ is a stopping time for the filtration $(\mathcal F_t)_{t \ge 0}$. The $\sigma$-field $\mathcal F_\tau$ is defined to be the set of events $A \in \mathcal F$ such that $A \cap \{\tau \le t\} \in \mathcal F_t$ for every $t \ge 0$. $\mathcal F_\tau$ can be viewed as the set of events determined prior to the stopping time $\tau$.
Example 3.13. Any nonnegative constant is a stopping time. To see this, let $\tau = a$, where $a$ is a nonnegative number. Then, for $t \ge 0$,
$$\{\omega : \tau(\omega) \le t\} = \begin{cases} \emptyset & \text{if } a > t, \\ \Omega & \text{if } a \le t. \end{cases}$$
Hence, for $t \ge 0$, $\{\omega : \tau(\omega) \le t\} \in \mathcal F_t$.
Example 3.14. If $X = \{X_n,\ n \ge 0\}$ is a sequence of real-valued random variables and $\mathcal F_n = \sigma(X_0, X_1, \dots, X_n)$, the hitting time
$$\tau(\omega) = \inf\{n \ge 0 : X_n(\omega) > a\}, \quad a \in \mathbb R,$$
is an $\mathcal F_n$-stopping time. To prove this, observe that
$$\{\tau \le n\} = \{X_0 > a\} \cup \bigcup_{k=1}^{n} \{X_0 \le a, \dots, X_{k-1} \le a,\ X_k > a\}.$$
Now, note that $\{\tau \le n\}$ consists of finite intersections and unions of events in $\mathcal F_k$ and that $\mathcal F_k \subset \mathcal F_n$ for $k \le n$. Thus, $\{\tau \le n\} \in \mathcal F_n$.
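As a numerical illustration (a sketch, not from the text: the Gaussian step distribution, the number of paths, and the threshold $a = 3$ are our own choices), the hitting time of Example 3.14 can be computed path by path; deciding whether $\{\tau \le n\}$ occurred uses only $X_0, \dots, X_n$, which is exactly the stopping-time property.

```python
import numpy as np

rng = np.random.default_rng(0)

def hitting_time(x, a):
    """First index n with x[n] > a, or len(x) if the level is never crossed."""
    above = np.nonzero(x > a)[0]
    return int(above[0]) if above.size else len(x)

# Simulate a few random-walk paths X_n = sum of i.i.d. N(0, 1) steps.
n_steps, a = 200, 3.0
steps = rng.standard_normal((5, n_steps))
paths = np.cumsum(steps, axis=1)

# tau is determined by the path up to time tau: an F_n-stopping time.
taus = [hitting_time(p, a) for p in paths]
print(taus)
```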
Stopping times have some nice properties.

Proposition 3.16. (i) If $\tau_1$ and $\tau_2$ are stopping times, then $\tau_1 \wedge \tau_2 = \inf\{\tau_1, \tau_2\}$ and $\tau_1 \vee \tau_2 = \sup\{\tau_1, \tau_2\}$ are also stopping times.
(ii) If $\tau$ is a stopping time and $a \in [0, \infty)$, then $\tau + a$ is also a stopping time.
(iii) If $\tau$ is a finite stopping time, then it is $\mathcal F_\tau$-measurable.
(iv) If $\tau_1$ and $\tau_2$ are stopping times and $\tau_1 \le \tau_2$, then $\mathcal F_{\tau_1} \subset \mathcal F_{\tau_2}$.

Proof. See, e.g., Métivier [140].
3.2.4 Gaussian Processes

Definition 3.24. The real-valued stochastic process $X = \{X(t),\ t \in T\}$ is called a Gaussian process if, for any finite subset $F \subset T$, the random vector $X_F := \{X(t),\ t \in F\}$ has a multivariate Gaussian distribution, with probability density
$$f_F(x) = \frac{1}{(2\pi)^{n/2}\sqrt{\det \Sigma}} \exp\Big(-\frac12 (x - \mu)' \Sigma^{-1} (x - \mu)\Big),$$
with parameters $\mu \in \mathbb R^n$ and $\Sigma$. (Here, $y'$ denotes the transpose of the vector $y$.) The quantity $\Sigma$ is a symmetric positive-definite $n \times n$ matrix, $\Sigma^{-1}$ is its inverse, and $\det \Sigma$ its determinant.

Equivalently, $X$ is Gaussian if every finite linear combination $\sum_{t \in F} \alpha_t X(t)$ has a Gaussian distribution on $\mathbb R$. The covariance function of a Gaussian process $X$ is the bivariate function
$$R(s, t) = \mathrm{Cov}(X(s), X(t)) = E\big[(X(s) - EX(s))(X(t) - EX(t))\big].$$
As with Gaussian vectors, it is important to note that the mean function and the covariance function of a Gaussian process completely determine all of its finite-dimensional distributions.
Example 3.15. Let $Z = (Z_1, \dots, Z_m) \in \mathbb R^m$ be a Gaussian vector. Define $X$ as follows:
$$X(t) = \sum_{k=1}^{m} Z_k\, w_k(t), \quad t \ge 0,$$
where $w_k(t)$, $k = 1, \dots, m$, are real-valued, deterministic, and continuous functions. We claim that $X$ is a Gaussian process. Indeed, let $X_n = (X(t_1), \dots, X(t_n))$, where $n \ge 1$ is an integer and $t_1, \dots, t_n$ denote arbitrary elements of $[0, \infty)$. The vector $X_n$ can be expressed as a linear transformation of the Gaussian vector $Z$, so it is Gaussian.
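Example 3.15 can be probed numerically (a sketch; the choices $m = 2$, $w_1(t) = \cos t$, $w_2(t) = \sin t$, and i.i.d. standard normal $Z_k$ are ours, not the text's): for $Z \sim \mathcal N(0, I_m)$ the covariance function is $R(s,t) = \sum_k w_k(s) w_k(t)$, and the empirical covariance of sampled values of $X(s)$ and $X(t)$ should match it.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n_samples = 2, 200_000
w = lambda t: np.array([np.cos(t), np.sin(t)])   # deterministic continuous w_k

# Z ~ N(0, I_m), so Cov(X(s), X(t)) = sum_k w_k(s) w_k(t).
Z = rng.standard_normal((n_samples, m))
s, t = 0.3, 1.1
Xs, Xt = Z @ w(s), Z @ w(t)

emp = np.mean(Xs * Xt)        # empirical covariance (both means are 0)
theory = float(w(s) @ w(t))   # = cos(s - t) for this choice of w_k
print(emp, theory)
```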
3.2.5 Martingales

In this subsection, we introduce and study a very important class of stochastic processes: the so-called martingales. Martingales arise naturally in many branches of the theory of stochastic processes. In particular, they play a key role in the study of Brownian motion. They are also crucial for the understanding of Itô integrals. Indefinite Itô integrals are constructed in such a way that they constitute martingales. Throughout this subsection, the index set $T$ denotes an arbitrary interval of $\mathbb R_+$.

Definition 3.25. The stochastic process $X = \{X(t),\ t \in T\}$ is called a continuous-time martingale with respect to the filtration $\{\mathcal F_t,\ t \in T\}$ (we write $(X, (\mathcal F_t))$) if

(i) $E|X(t)| < \infty$ for all $t \in T$;
(ii) $X$ is adapted to $(\mathcal F_t)$;
(iii)
$$E\big[X(t) \mid \mathcal F_s\big] = X(s) \quad P\text{-a.s.} \tag{3.9}$$
for all $s < t$ in $T$.

It follows from the definition of conditional expectation that the identity (3.9) is equivalent to the statement
$$\int_F X(t)\, dP = \int_F X(s)\, dP, \quad \text{for } F \in \mathcal F_s,\ 0 \le s \le t,$$
and that the expectation function $EX$ is constant (that is, $E(X(s)) = E(X(t))$ for all $s$ and $t$).

When the equality in (3.9) is replaced with $\le$, the process is called a supermartingale. When it is replaced with $\ge$, the process is called a submartingale.

It is also possible to define a discrete-time martingale $X = \{X_n,\ n = 0, 1, 2, \dots\}$. In this case, property (3.9) becomes
$$E\big[X_{n+k} \mid \mathcal F_n\big] = X_n, \quad k \ge 0.$$
The basic properties of conditional expectation give us the following properties of martingales.

Proposition 3.17. Let $X$ be an integrable random variable and $(\mathcal F_t)_{t \in T}$ a filtration. For $t \in T$, define $M(t) = E[X \mid \mathcal F_t]$. Then $M = \{M(t),\ t \in T\}$ is an $(\mathcal F_t)$-martingale and $M$ is uniformly integrable. In addition, if $\varphi$ is a convex function such that $E|\varphi(M(t))| < \infty$ for all $t \in T$, then the stochastic process $\varphi(M)$ is a submartingale.

Proof. By the Jensen inequality for conditional expectation (Proposition 3.10 (vii)) and the integrability of $X$, we have
$$E|M(t)| \le E\big[E[|X| \mid \mathcal F_t]\big] = E|X| < \infty.$$
Also, $M$ is $\mathcal F_t$-adapted because $E[X \mid \mathcal F_t]$ is $\mathcal F_t$-measurable for each $t \ge 0$. Properties of the conditional expectation give
$$E\big[M(t) \mid \mathcal F_s\big] = E\big[E[X \mid \mathcal F_t] \mid \mathcal F_s\big] = E\big[X \mid \mathcal F_s\big] = M(s)$$
for all $s \le t$. Hence, $M$ obeys the properties of a martingale.

Similarly, the Jensen inequality applied to the convex function $\varphi$ and properties of conditional expectation yield
$$E\big[\varphi(M(t)) \mid \mathcal F_s\big] \ge \varphi\big(E\big[E[X \mid \mathcal F_t] \mid \mathcal F_s\big]\big) = \varphi\big(E[X \mid \mathcal F_s]\big) = \varphi(M(s)).$$
Thus, $\varphi(M)$ is a submartingale.
Let us now define Brownian motion, which plays a key role in the construction of stochastic integrals.

Definition 3.26. A (standard one-dimensional) Brownian motion is a continuous adapted real-valued process $(B(t),\ t \ge 0)$ such that

(i) $B(0) = 0$;
(ii) $B(t) - B(s)$ is independent of $\mathcal F_s$ for all $0 \le s < t$;
(iii) $B(t) - B(s)$ is $\mathcal N(0, t-s)$-distributed for all $0 \le s \le t$.

Note that the Brownian motion $B$ has the following properties:

(a) $B$ has independent increments, that is, for $t_1 < t_2 < \dots < t_n$, the random variables $B(t_1) - B(0), B(t_2) - B(t_1), \dots, B(t_n) - B(t_{n-1})$ are independent;
(b) $B$ has stationary increments, that is, $B(t+s) - B(t)$ has the same distribution as $B(s) - B(0)$.
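The defining properties can be checked empirically on simulated paths (a sketch; the grid size, horizon, and number of paths are arbitrary choices of ours). Sampled increments over disjoint intervals should have the prescribed mean and variance and be uncorrelated.

```python
import numpy as np

rng = np.random.default_rng(2)

def brownian_paths(n_paths, n_steps, T):
    """Sample B on the grid t_k = k T / n_steps via i.i.d. N(0, dt) increments."""
    dt = T / n_steps
    dB = rng.standard_normal((n_paths, n_steps)) * np.sqrt(dt)
    return np.concatenate([np.zeros((n_paths, 1)), np.cumsum(dB, axis=1)], axis=1)

B = brownian_paths(100_000, 100, 1.0)

# B(0.75) - B(0.25) ~ N(0, 0.5): check mean and variance.
incr = B[:, 75] - B[:, 25]
print(incr.mean(), incr.var())      # ~ 0 and ~ 0.5

# Disjoint increments are independent, hence uncorrelated.
incr2 = B[:, 100] - B[:, 75]
print(np.mean(incr * incr2))        # ~ 0
```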
The following proposition gathers some simple examples of stochastic processes which have the martingale property.

Proposition 3.18. Let $\{B(t),\ t \ge 0\}$ be a Brownian motion, and define $\mathcal F_t = \sigma\{B(s);\ s \le t\}$. Then the following stochastic processes are martingales with respect to this filtration:

(i) $(B(t), \mathcal F_t)_{t \ge 0}$ itself;
(ii) $(B(t)^2 - t, \mathcal F_t)_{t \ge 0}$;
(iii) for every $u \in \mathbb R$, the process $\big(\exp\big(uB(t) - \frac{u^2}{2}t\big), \mathcal F_t\big)_{t \ge 0}$ (called an exponential martingale).
Proof. Let us first verify that $(B(t), \mathcal F_t)_{t \ge 0}$ is a martingale. Since $B(t) \sim \mathcal N(0, t)$, $B(t)$ is clearly integrable; and since $B(t) - B(s)$ is independent of $\mathcal F_s$ by Definition 3.26(ii), $E[B(t) - B(s) \mid \mathcal F_s] = 0$, or equivalently, $E[B(t) \mid \mathcal F_s] = B(s)$.

Likewise, using the properties of conditional expectation,
$$E\big[(B(t) - B(s))^2 \mid \mathcal F_s\big] = E\big[B(t)^2 - 2B(t)B(s) + B(s)^2 \mid \mathcal F_s\big] = E\big[B(t)^2 \mid \mathcal F_s\big] - B(s)^2. \tag{3.10}$$
On the other hand, since $B(t) - B(s) \sim \mathcal N(0, t-s)$ is independent of $\mathcal F_s$,
$$E\big[(B(t) - B(s))^2 \mid \mathcal F_s\big] = E\big[(B(t) - B(s))^2\big] = t - s. \tag{3.11}$$
Hence, combining (3.10) and (3.11), we obtain
$$E\big[B(t)^2 - t \mid \mathcal F_s\big] = B(s)^2 - s.$$
This shows that $B(t)^2 - t$ is an $(\mathcal F_t)$-martingale.
As to part (iii), since $B(t) - B(s) \sim \mathcal N(0, t-s)$, its moment-generating function is
$$E\big[\exp\big(u(B(t) - B(s))\big)\big] = \exp\Big(\frac12 u^2 (t-s)\Big)$$
for any $u \in \mathbb R$ and $0 \le s \le t$. Then, using the fact that $B(t) - B(s)$ is independent of $\mathcal F_s$,
$$\begin{aligned}
E\Big[\exp\Big(uB(t) - \frac12 u^2 t\Big) \Big| \mathcal F_s\Big] &= E\Big[\exp\Big(u(B(t) - B(s)) + uB(s) - \frac12 u^2 t\Big) \Big| \mathcal F_s\Big] \\
&= \exp\Big(uB(s) - \frac12 u^2 t\Big)\, E\big[\exp\big(u(B(t) - B(s))\big)\big] \\
&= \exp\Big(uB(s) - \frac12 u^2 t\Big) \exp\Big(\frac12 u^2 (t-s)\Big) \\
&= \exp\Big(uB(s) - \frac12 u^2 s\Big).
\end{aligned}$$
Hence, the process $\big(\exp\big(uB(t) - \frac{u^2}{2}t\big), \mathcal F_t\big)_{t \ge 0}$ is a martingale.
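The three martingales of Proposition 3.18 can be probed by Monte Carlo (a sketch; the parameter $u = 0.5$, the horizon, and the sample size are our choices): since a martingale has constant expectation, each process should have mean equal to its value at $t = 0$, namely $0$, $0$, and $1$.

```python
import numpy as np

rng = np.random.default_rng(3)
n_paths, n_steps, T, u = 200_000, 50, 1.0, 0.5
dt = T / n_steps
B = np.cumsum(rng.standard_normal((n_paths, n_steps)) * np.sqrt(dt), axis=1)
BT = B[:, -1]                                   # samples of B(T)

# E M(T) = E M(0) for each martingale M.
print(BT.mean())                                # E B(T)              ~ 0
print((BT**2 - T).mean())                       # E [B(T)^2 - T]      ~ 0
print(np.exp(u * BT - u**2 * T / 2).mean())     # E exp-martingale    ~ 1
```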
For continuous martingales we have the following inequalities due to Doob.

Theorem 3.3 (Doob's inequality). Let $\{M(t)\}_{0 \le t \le T}$ be a continuous martingale.

(i) If $p \ge 1$ and $M(t) \in L^p(\Omega; \mathbb R)$, then
$$P\Big\{\omega : \sup_{0 \le t \le T} |M(\omega, t)| > c\Big\} \le \frac{E|M(T)|^p}{c^p};$$
(ii) If $p > 1$ and $M(t) \in L^p(\Omega; \mathbb R)$, then
$$E\Big[\sup_{0 \le t \le T} |M(t)|^p\Big] \le \Big(\frac{p}{p-1}\Big)^p E\big[|M(T)|^p\big].$$

Further discussions on this topic may be found in Stroock and Varadhan [168] or Revuz and Yor [158].
3.3 Stochastic Integrals in One Dimension

3.3.1 Motivation

In applications, it is typical to characterize the current state of a physical system by a real function of time $x(t)$, $t \ge 0$, called the state. Generally, the behavior of a physical system based on an input $w(t)$, $t \ge 0$, can be specified by a differential equation of the form
$$\frac{dx(t)}{dt} = \mu(x(t)) + \sigma(x(t))\, w(t), \quad t \ge 0, \tag{3.12}$$
where the functions $\mu$ and $\sigma$ depend on the system properties. In classical analysis, the study of the solutions of such an equation is based on the assumptions that the system properties and the input are perfectly known and deterministic.

Here, we generalize Eq. (3.12) by assuming that the input is a real stochastic process. Because the input is random, the state becomes a real stochastic process. Now, let $X$ denote the solution of (3.12) with $w$ replaced by a stochastic process $Z$. It is customary to assume that $Z$ is a white noise process, for which $E[Z(t)] = 0$ and $\mathrm{Cov}(Z(s), Z(t)) = 1$ if $s = t$ and zero otherwise. It is important to note that for $t_1 < t_2 < t_3$,
$$\mathrm{Cov}\Big(\int_{t_1}^{t_2} Z(s)\, ds,\ \int_{t_2}^{t_3} Z(s)\, ds\Big) = 0 \tag{3.13}$$
whereas
$$\mathrm{Var}\Big(\int_0^t Z(s)\, ds\Big) = t. \tag{3.14}$$

The Gaussian white noise process is often used. Such a stochastic process $\{Z(t),\ t \in \mathbb R\}$ has irregular sample paths and is very difficult to work with directly. As a result, it is easier to work with its integral. This suggests writing (3.12) in the form
$$X(t) = X(0) + \int_0^t \mu(X(s))\, ds + \int_0^t \sigma(X(s)) Z(s)\, ds. \tag{3.15}$$
In this integrated version, we need to make mathematical sense of the stochastic integral involving the integrator $Z(s)\, ds$. From a notational standpoint, it is common to write
$$dX(t) = \mu(X(t))\, dt + \sigma(X(t)) Z(t)\, dt. \tag{3.16}$$
Note that given a Brownian motion $B$, it is not difficult to verify that
$$\mathrm{Cov}\big(B(t_2) - B(t_1),\ B(t_3) - B(t_2)\big) = 0 \quad \text{and} \quad \mathrm{Var}\big(B(t) - B(0)\big) = t.$$
Given the similarity with (3.13) and (3.14), this hints that $B$ can be viewed as integrated white noise, so that we could rigorously define $\int_0^t Z(s)\, ds$ to be $B(t)$. This is quite an oversimplification: to write $B(t) = \int_0^t Z(s)\, ds$ would require that $B$ be differentiable almost everywhere (in time $t$). Unfortunately, this is not the case: $B$ is nowhere differentiable. This oversimplification comes from the fact that white noise does not exist as a well-defined stochastic process. On the other hand, Brownian motion is well defined, so this suggests that we should replace (3.15) with
$$X(t) = x_0 + \int_0^t \mu(X(s))\, ds + \int_0^t \sigma(X(s))\, dB(s) \tag{3.17}$$
and (3.16) with
$$\begin{cases} dX(t) = \mu(X(t))\, dt + \sigma(X(t))\, dB(t), \\ X(0) = x_0. \end{cases} \tag{3.18}$$
Note that in (3.17), the integral $\int_0^t \mu(X(s))\, ds$ can be defined via a standard Riemann approximation. On the other hand, $\int_0^t \sigma(X(s))\, dB(s)$ must be defined differently, since the integrator is a nondifferentiable stochastic process. This leads us to outline the construction of the so-called Itô integral.
3.3.2 Itô Integrals

Let $(\Omega, \mathcal F, \mathcal F_t, P)$ be a filtered probability space and let $B = \{B(t),\ t \ge 0\}$ be a one-dimensional Brownian motion defined on this space.

Definition 3.27. Let $0 \le S < T < \infty$. Denote by $\mathcal V([S,T]; \mathbb R)$ the space of all real-valued measurable $(\mathcal F_t)$-adapted processes $\phi = \{\phi(t),\ t \ge 0\}$ such that
$$\|\phi\|_{\mathcal V}^2 = E\Big[\int_S^T |\phi(t)|^2\, dt\Big] < \infty.$$
We identify $\phi$ and $\bar\phi$ in $\mathcal V([S,T]; \mathbb R)$ if $\|\phi - \bar\phi\|_{\mathcal V}^2 = 0$. In this case we say that $\phi$ and $\bar\phi$ are equivalent and we write $\phi = \bar\phi$.

It is routine to show that the space $\mathcal V([S,T]; \mathbb R)$ equipped with the norm $\|\cdot\|_{\mathcal V}$ is a Banach space. Furthermore, without loss of generality we may assume that every stochastic process $\phi \in \mathcal V([S,T]; \mathbb R)$ is predictable.
Since full details on the construction of the Itô integral $\int_S^T \phi(t)\, dB(t)$ for stochastic processes $\phi \in \mathcal V([S,T]; \mathbb R)$ can be found in either Øksendal [150] or Mao and Yuan [135], here we shall outline only its construction. The idea is as follows. First, define the integral $\int_S^T \phi(t)\, dB(t)$ for a class of simple processes $\phi$. Then show that each $\phi \in \mathcal V([S,T]; \mathbb R)$ can be approximated by such simple processes $\phi_n$, and define $\int_S^T \phi(t)\, dB(t)$ as the limit of the integrals $\int_S^T \phi_n(t)\, dB(t)$.

Let us first introduce the concept of simple stochastic processes.

Definition 3.28. A stochastic process $\phi \in \mathcal V([S,T]; \mathbb R)$ is called simple if it is of the form
$$\phi(\omega, t) = \phi_0(\omega)\, 1_{[t_0, t_1)}(t) + \sum_{i=0}^{k-1} \phi_i(\omega)\, 1_{(t_i, t_{i+1}]}(t),$$
with a partition $S = t_0 < t_1 < \dots < t_k = T$ of $[S,T]$ and bounded $\mathcal F_{t_i}$-measurable random variables $\phi_i$, $0 \le i \le k-1$.
For any simple stochastic process $\phi \in \mathcal V([S,T]; \mathbb R)$ we define
$$\int_S^T \phi(t)\, dB(t) := \sum_{i=0}^{k-1} \phi_i \big[B(t_{i+1}) - B(t_i)\big]. \tag{3.19}$$
Obviously, the integral $\int_S^T \phi(t)\, dB(t)$ is a well-defined random variable. Moreover, the following properties hold:
$$E\Big[\int_S^T \phi(t)\, dB(t)\Big] = 0, \tag{3.20}$$
$$E\Big|\int_S^T \phi(t)\, dB(t)\Big|^2 = \int_S^T E|\phi(t)|^2\, dt. \tag{3.21}$$
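Identities (3.19)–(3.21) can be checked numerically on a concrete simple process (a sketch; the two-interval partition and the particular $\phi_0$, $\phi_1$ below are our own choices, with $\phi_1 = \mathrm{sign}(B(t_1))$ chosen so that it is $\mathcal F_{t_1}$-measurable):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 500_000
# Partition 0 = t0 < t1 < t2 = 1 with t1 = 0.5; simulate the two increments.
dB1 = rng.standard_normal(n) * np.sqrt(0.5)   # B(t1) - B(t0)
dB2 = rng.standard_normal(n) * np.sqrt(0.5)   # B(t2) - B(t1)

phi0 = np.full(n, 2.0)        # F_{t0}-measurable (a constant)
phi1 = np.sign(dB1)           # F_{t1}-measurable, |phi1| = 1 a.s.
I = phi0 * dB1 + phi1 * dB2   # the sum (3.19)

# (3.20): E I = 0.  (3.21): E I^2 = 4 * 0.5 + 1 * 0.5 = 2.5.
print(I.mean(), (I**2).mean())
```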
To prove these identities, note that $\phi_i$ is $\mathcal F_{t_i}$-measurable and that $B(t_{i+1}) - B(t_i)$ is independent of $\mathcal F_{t_i}$. Hence,
$$E \int_S^T \phi(t)\, dB(t) = \sum_{i=0}^{k-1} E\big[\phi_i \big(B(t_{i+1}) - B(t_i)\big)\big] = \sum_{i=0}^{k-1} E(\phi_i)\, E\big[B(t_{i+1}) - B(t_i)\big] = 0.$$
Moreover, note that $B(t_{j+1}) - B(t_j)$ is independent of $\phi_i \phi_j (B(t_{i+1}) - B(t_i))$ if $i < j$. Thus,
$$\begin{aligned}
E\Big|\int_S^T \phi(t)\, dB(t)\Big|^2 &= \sum_{0 \le i, j \le k-1} E\big[\phi_i \phi_j \big(B(t_{i+1}) - B(t_i)\big)\big(B(t_{j+1}) - B(t_j)\big)\big] \\
&= \sum_{i=0}^{k-1} E\big[\phi_i^2 \big(B(t_{i+1}) - B(t_i)\big)^2\big] \\
&= \sum_{i=0}^{k-1} E(\phi_i^2)\, E\big[\big(B(t_{i+1}) - B(t_i)\big)^2\big] \\
&= \sum_{i=0}^{k-1} E(\phi_i^2)\, (t_{i+1} - t_i) = E\Big[\int_S^T |\phi(t)|^2\, dt\Big].
\end{aligned}$$
Also, for any simple stochastic processes $\phi_1, \phi_2 \in \mathcal V([S,T]; \mathbb R)$ and $c_1, c_2 \in \mathbb R$, we have
$$\int_S^T \big[c_1 \phi_1(t) + c_2 \phi_2(t)\big]\, dB(t) = c_1 \int_S^T \phi_1(t)\, dB(t) + c_2 \int_S^T \phi_2(t)\, dB(t). \tag{3.22}$$
The proof of (3.22) is left to the reader as an exercise.
We can now extend the Itô integral from simple stochastic processes to stochastic processes in $\mathcal V([S,T]; \mathbb R)$. This is based on the following approximation result.

Lemma 3.2. For any $\phi \in \mathcal V([S,T]; \mathbb R)$, there exists a sequence $(\phi_n)$ of simple stochastic processes such that
$$\lim_{n \to \infty} \int_S^T E|\phi(t) - \phi_n(t)|^2\, dt = 0.$$

We are now prepared to outline the construction of the Itô integral for a stochastic process $\phi \in \mathcal V([S,T]; \mathbb R)$. By Lemma 3.2, there is a sequence $(\phi_n)$ of simple stochastic processes such that
$$\lim_{n \to \infty} \int_S^T E|\phi(t) - \phi_n(t)|^2\, dt = 0.$$
Thus, by property (3.21),
$$E\Big|\int_S^T \phi_n(t)\, dB(t) - \int_S^T \phi_m(t)\, dB(t)\Big|^2 = \int_S^T E|\phi_n(t) - \phi_m(t)|^2\, dt \to 0 \quad \text{as } m, n \to \infty.$$
Hence, the sequence $\big\{\int_S^T \phi_n(t)\, dB(t),\ n \ge 1\big\}$ is a Cauchy sequence in $L^2(\Omega; \mathbb R)$, which, in turn, implies that it is convergent. This leads us to the following definition.

Definition 3.29. Let $\phi \in \mathcal V([S,T]; \mathbb R)$. The Itô integral of $\phi$ with respect to $(B(t))$ is defined by
$$\int_S^T \phi(t)\, dB(t) = \lim_{n \to \infty} \int_S^T \phi_n(t)\, dB(t) \quad \text{in } L^2(\Omega; \mathbb R),$$
where $(\phi_n)$ is a sequence of simple stochastic processes such that
$$\lim_{n \to \infty} E\Big[\int_S^T |\phi(t) - \phi_n(t)|^2\, dt\Big] = 0.$$
It is important to note that this integral does not depend on the choice of the approximating sequence.
We now gather the main properties of the Itô integral.

Proposition 3.19. Let $\phi, \psi$ be stochastic processes in $\mathcal V([S,T]; \mathbb R)$, and let $0 \le S < U < T$. Then

(a) $E\big[\int_S^T \phi(t)\, dB(t)\big] = 0$;
(b) $E\big|\int_S^T \phi(t)\, dB(t)\big|^2 = \int_S^T E|\phi(t)|^2\, dt$ (Itô isometry);
(c) $\int_S^T (c\phi(t) + \psi(t))\, dB(t) = c \int_S^T \phi(t)\, dB(t) + \int_S^T \psi(t)\, dB(t)$ ($c$ constant);
(d) $\int_S^T \phi(t)\, dB(t) = \int_S^U \phi(t)\, dB(t) + \int_U^T \phi(t)\, dB(t)$;
(e) $\int_S^T \phi(s)\, dB(s)$ is $\mathcal F_T$-measurable.

The proof is left to the reader as an exercise.

Definition 3.30. Let $\phi \in \mathcal V([0,T]; \mathbb R)$. Define
$$I(t) := \int_0^t \phi(s)\, dB(s), \quad \text{for } 0 \le t \le T,$$
where, by definition, $I(0) = 0$. We call $I(t)$ the indefinite Itô integral of $\phi$.
The indefinite integral $I(t)$ has the following interesting properties.

Proposition 3.20. The following properties hold.

(i) $I(t)$ is $\mathcal F_t$-adapted and square-integrable;
(ii) $\{I(t),\ t \ge 0\}$ is an $\mathcal F_t$-martingale and
$$E\Big[\sup_{0 \le t \le T} |I(t)|^2\Big] \le 4 \int_0^T E|\phi(s)|^2\, ds; \tag{3.23}$$
(iii) $\{I(t),\ 0 \le t \le T\}$ has a continuous version.

Proof. Clearly, for each $t$ in $[0,T]$, $I(t)$ is $\mathcal F_t$-adapted and square-integrable. To prove part (ii), we fix $0 \le s < t \le T$ and use the properties of conditional expectation and Brownian motion to obtain
$$E\big[I(t) \mid \mathcal F_s\big] = E\Big[\int_0^s \phi(r)\, dB(r) \Big| \mathcal F_s\Big] + E\Big[\int_s^t \phi(r)\, dB(r) \Big| \mathcal F_s\Big] = E\big[I(s) \mid \mathcal F_s\big] + E\Big[\int_s^t \phi(r)\, dB(r) \Big| \mathcal F_s\Big] = I(s).$$
The inequality (3.23) follows from Doob's martingale inequality.
As to (iii), let $(\phi_n)$ be a sequence of simple stochastic processes such that
$$\lim_{n \to \infty} \int_0^T E|\phi(s) - \phi_n(s)|^2\, ds = 0.$$
Note from the continuity of Brownian motion that the indefinite integrals
$$I_n(t) = \int_0^t \phi_n(s)\, dB(s), \quad 0 \le t \le T,$$
are continuous. By Proposition 3.20(ii), the stochastic process $\{I_n(t) - I_m(t),\ 0 \le t \le T\}$ is a martingale for each pair of integers $n, m$. Hence, by Doob's martingale inequality, it follows that for any $\varepsilon > 0$,
$$P\Big\{\omega : \sup_{0 \le t \le T} |I_n(\omega, t) - I_m(\omega, t)| > \varepsilon\Big\} \le \frac{1}{\varepsilon^2} E|I_n(T) - I_m(T)|^2 = \frac{1}{\varepsilon^2} \int_0^T E|\phi_n(s) - \phi_m(s)|^2\, ds \to 0 \quad \text{as } n, m \to \infty.$$
Hence, we may choose a subsequence $n_k \uparrow \infty$ such that
$$P\Big\{\omega : \sup_{0 \le t \le T} |I_{n_{k+1}}(\omega, t) - I_{n_k}(\omega, t)| > 2^{-k}\Big\} \le 2^{-k}.$$
By the Borel–Cantelli lemma, we have
$$P\Big\{\omega : \sup_{0 \le t \le T} |I_{n_{k+1}}(\omega, t) - I_{n_k}(\omega, t)| > 2^{-k} \text{ for infinitely many } k\Big\} = 0.$$
That is, there exists a set $\Omega_0 \in \mathcal F$ with $P(\Omega_0) = 0$ and a positive integer $k(\omega)$ such that for every $\omega \in \Omega_0^c$,
$$\sup_{0 \le t \le T} |I_{n_{k+1}}(\omega, t) - I_{n_k}(\omega, t)| \le 2^{-k} \quad \text{for } k \ge k(\omega).$$
Therefore, $(I_{n_k}(\omega, \cdot))$ is uniformly convergent on $[0,T]$ for each $\omega \in \Omega_0^c$, and the limit, denoted by $J(\omega, t)$, is continuous in $t \in [0,T]$. Since $I_{n_k}(\omega, t) \to I(\omega, t)$ for all $t$, we must have
$$I(t) = J(t) \quad \text{a.s. for all } t \in [0,T].$$
3.3.3 Itô Integrals with Stopping Times

Let $\tau$ be a stopping time and define
$$1_{[0,\tau]}(t) = \begin{cases} 1 & \text{if } t \le \tau, \\ 0 & \text{if } t > \tau, \end{cases}$$
the indicator function of $[0, \tau]$. Then the stochastic process $\{1_{[0,\tau]}(t),\ t \ge 0\}$ is $\mathcal F_t$-adapted. Indeed, for each $t \ge 0$,
$$\{\omega : 1_{[0,\tau(\omega)]}(t) \le a\} = \begin{cases} \emptyset & \text{if } a < 0, \\ \{\omega : \tau(\omega) < t\} \in \mathcal F_t & \text{if } 0 \le a < 1, \\ \Omega \in \mathcal F_t & \text{if } a \ge 1. \end{cases}$$
Thus, $\{1_{[0,\tau]}(t),\ t \ge 0\}$ is $\mathcal F_t$-adapted. It is also predictable.

We can now define stochastic integrals with stopping times.

Definition 3.31. Let $\phi \in \mathcal V([0,T]; \mathbb R)$ and let $\tau$ be an $\mathcal F_t$-stopping time such that $0 \le \tau \le T$. Define
$$\int_0^\tau \phi(s)\, dB(s) := \int_0^T 1_{[0,\tau]}(s)\, \phi(s)\, dB(s). \tag{3.24}$$
Furthermore, if $\rho$ is another stopping time with $0 \le \rho \le \tau$, we define
$$\int_\rho^\tau \phi(s)\, dB(s) = \int_0^\tau \phi(s)\, dB(s) - \int_0^\rho \phi(s)\, dB(s). \tag{3.25}$$

Note that the integral (3.24) is well defined because the stochastic process $\{1_{[0,\tau]}(t)\, \phi(t),\ t \in [0,T]\}$ belongs to $\mathcal V([0,T]; \mathbb R)$. Hence, the integral (3.25) can be rewritten as follows:
$$\int_\rho^\tau \phi(s)\, dB(s) := \int_0^T 1_{(\rho,\tau]}(s)\, \phi(s)\, dB(s). \tag{3.26}$$
The following properties can be deduced easily from Proposition 3.19.

Proposition 3.21. Let $\phi \in \mathcal V([0,T]; \mathbb R)$ and let $\rho, \tau$ be two stopping times such that $0 \le \rho \le \tau \le T$. Then

(i) $E\big[\int_\rho^\tau \phi(s)\, dB(s)\big] = 0$,
(ii) $E\big|\int_\rho^\tau \phi(s)\, dB(s)\big|^2 = E\big[\int_\rho^\tau |\phi(s)|^2\, ds\big]$.
3.3.4 Itô Formula

In the last two subsections we defined the Itô integral of the form $\int_0^t \phi(s)\, dB(s)$ and collected its properties. However, with the exception of simple processes, we do not yet have tools to calculate Itô integrals or to perform simple operations on them. It is now our objective to provide such a tool, namely the Itô formula. Here, we present two versions of the Itô lemma.

In what follows, we use the following notation for the partial derivatives of $f$:
$$f_i(t, x) = \frac{\partial}{\partial x_i} f(x_1, x_2)\Big|_{x_1 = t,\, x_2 = x}, \quad i = 1, 2,$$
$$f_{ij}(t, x) = \frac{\partial^2}{\partial x_i \partial x_j} f(x_1, x_2)\Big|_{x_1 = t,\, x_2 = x}, \quad i, j = 1, 2.$$
Theorem 3.4 (Version I of the Itô formula). Let $f(t, x)$ be a function whose second-order partial derivatives exist and are continuous. Then, for $s < t$,
$$f(t, B(t)) - f(s, B(s)) = \int_s^t \Big[f_1(\tau, B(\tau)) + \frac12 f_{22}(\tau, B(\tau))\Big]\, d\tau + \int_s^t f_2(\tau, B(\tau))\, dB(\tau).$$

Proof. Assume that $f(t, x)$ has continuous partial derivatives of at least second order. Write $B(t + dt) - B(t)$ for the increment of $B$ on $[t, t + dt]$. Using a Taylor expansion we can write
$$\begin{aligned}
f(t + dt, B(t + dt)) - f(t, B(t)) &= f_1(t, B(t))\, dt + f_2(t, B(t))\, dB(t) \\
&\quad + \frac12 \Big[f_{11}(t, B(t))\, (dt)^2 + 2 f_{12}(t, B(t))\, dt\, dB(t) + f_{22}(t, B(t))\, (dB(t))^2\Big] + \cdots. 
\end{aligned} \tag{3.27}$$
Now, let us formally introduce a multiplication table:
$$dt\, dt = 0, \quad dB(t)\, dt = 0, \quad dB(t)\, dB(t) = dt, \quad dt\, dB(t) = 0.$$
As in classical calculus, higher-order terms in (3.27) are negligible, and so are the terms with factors $dt\, dB(t)$ and $(dt)^2$. However, since we interpret $(dB(t))^2$ as $dt$, the term with $(dB(t))^2$ cannot be neglected. We then have
$$f(t, B(t)) - f(s, B(s)) = \int_s^t \Big[f_1(\tau, B(\tau)) + \frac12 f_{22}(\tau, B(\tau))\Big]\, d\tau + \int_s^t f_2(\tau, B(\tau))\, dB(\tau), \quad s < t,$$
as desired.
Theorem 3.5 (Version II of the Itô formula). Let $X$ be an Itô process given by
$$dX(t) = a(t)\, dt + b(t)\, dB(t) \tag{3.28}$$
with both $a(t)$ and $b(t)$ adapted to the Brownian motion $B$, and let $f(t, x)$ be a function whose second-order partial derivatives exist and are continuous. Then, for $s < t$,
$$\begin{aligned}
f(t, X(t)) - f(s, X(s)) &= \int_s^t \Big[f_1(\tau, X(\tau)) + a(\tau) f_2(\tau, X(\tau)) + \frac12 b(\tau)^2 f_{22}(\tau, X(\tau))\Big]\, d\tau \\
&\quad + \int_s^t b(\tau) f_2(\tau, X(\tau))\, dB(\tau).
\end{aligned} \tag{3.29}$$

Proof. To justify formula (3.29), we proceed as before. We use a Taylor expansion for $f(t + dt, X(t + dt)) - f(t, X(t))$ as in (3.27), where $B$ is replaced with $X$, and $X$ is given by (3.28). Neglecting higher-order terms, starting with those involving $(dt)^2$ and $dt\, dB(t)$, and making use of $(dB(t))^2 = dt$, we obtain the desired formula.

Formula (3.29) is often written as follows:
$$f(t, X(t)) - f(s, X(s)) = \int_s^t \Big[f_1(\tau, X(\tau)) + \frac12 b(\tau)^2 f_{22}(\tau, X(\tau))\Big]\, d\tau + \int_s^t f_2(\tau, X(\tau))\, dX(\tau),$$
where
$$dX(t) = a(t)\, dt + b(t)\, dB(t).$$
To illustrate the usefulness of the Itô formula, we provide some examples.

Example 3.16. Let us evaluate the stochastic integral $\int_0^t B(s)\, dB(s)$. To do this, take $f(t, x) = \frac12 x^2$. Noting that $B(0) = 0$ and applying the Itô formula yield
$$\frac12 B(t)^2 = \int_0^t \frac12\, ds + \int_0^t B(s)\, dB(s) = \frac12 t + \int_0^t B(s)\, dB(s).$$
Hence,
$$\int_0^t B(s)\, dB(s) = \frac12 B(t)^2 - \frac12 t.$$
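Example 3.16 can be confirmed numerically (a sketch; grid size and number of paths are arbitrary choices of ours): left-endpoint Riemann sums $\sum_i B(t_i)[B(t_{i+1}) - B(t_i)]$, which is how the Itô integral is built from simple processes, converge in mean square to $\frac12 B(t)^2 - \frac12 t$.

```python
import numpy as np

rng = np.random.default_rng(5)
n_paths, n_steps, T = 10_000, 1_000, 1.0
dt = T / n_steps
dB = rng.standard_normal((n_paths, n_steps)) * np.sqrt(dt)
B = np.concatenate([np.zeros((n_paths, 1)), np.cumsum(dB, axis=1)], axis=1)

# Ito sums use the LEFT endpoint B(t_i) on each increment.
ito_sum = np.sum(B[:, :-1] * dB, axis=1)
exact = 0.5 * B[:, -1] ** 2 - 0.5 * T

err = np.mean((ito_sum - exact) ** 2)   # mean-square discretization error
print(err)                               # small, of order dt
```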
Example 3.17. Consider the following stochastic equation:
$$dX(t) = rX(t)\, dt + cX(t)\, dB(t),$$
where $rX(t)\, dt$ represents exponential growth, $r > 0$, and where $cX(t)\, dB(t)$ represents environmental variation, $c > 0$. Now, take $f(t, x) = \ln x$. Applying the Itô formula gives
$$\ln X(t) - \ln X(0) = \int_0^t \Big(r - \frac12 c^2\Big)\, ds + c \int_0^t dB(s),$$
that is,
$$\ln\Big(\frac{X(t)}{X(0)}\Big) = \Big(r - \frac12 c^2\Big) t + cB(t).$$
Thus, $X(t) = X(0) \exp\big(\big(r - \frac12 c^2\big) t + cB(t)\big)$.
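A sketch comparing the exact solution of Example 3.17 with a simple Euler–Maruyama discretization of the SDE (the parameters $r$, $c$, $X(0)$, and the step size are arbitrary choices of ours; Euler–Maruyama is a standard scheme, not one introduced in the text): driving both with the same Brownian increments, the two should agree closely for small steps.

```python
import numpy as np

rng = np.random.default_rng(6)
r, c, x0, T, n_steps = 0.05, 0.2, 1.0, 1.0, 2_000
dt = T / n_steps
dB = rng.standard_normal(n_steps) * np.sqrt(dt)
B = np.cumsum(dB)

# Euler-Maruyama for dX = r X dt + c X dB:  X_{k+1} = X_k + r X_k dt + c X_k dB_k.
x = x0
for db in dB:
    x = x + r * x * dt + c * x * db

# Exact solution from the Ito formula, evaluated on the same path.
exact = x0 * np.exp((r - 0.5 * c**2) * T + c * B[-1])
print(x, exact)
```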
3.3.5 Diffusion Process

To end this section, we now define the infinitesimal drift and variance of a diffusion. Let us first introduce the following definition.

Definition 3.32. A stochastic process $X = \{X(t),\ t \ge 0\}$ is said to be a Markov process if
$$E\big[f(X(t+s)) \mid \mathcal F_t\big] = E\big[f(X(t+s)) \mid X(t)\big]$$
for all $t, s \ge 0$ and all Borel measurable functions $f : \mathbb R \to \mathbb R$ such that $E|f(X(t))| < \infty$ for all $t$.

Under reasonable conditions on $\mu(\cdot)$ and $\sigma(\cdot)$, there exists a solution $X = \{X(t),\ t \ge 0\}$ to (3.18). The state stochastic process $X$ is called a diffusion or Itô process. It is a Markov process with continuous paths and is time-homogeneous in the sense that
$$P_x\big\{X(t+h) \in \Gamma \mid X(u) : 0 \le u \le t\big\} = P(h, X(t), \Gamma)$$
for Borel sets $\Gamma \subset \mathbb R$, where
$$P(h, x, B) = P_x\{X(h) \in B\}.$$
Note that when $h > 0$ is small,
$$X(h) - X(0) = \int_0^h \mu(X(s))\, ds + \int_0^h \sigma(X(s))\, dB(s) \approx \mu(X(0))\, h + \sigma(X(0))\, [B(h) - B(0)].$$
Hence,
$$E_x[X(h) - x] = \mu(x)\, h + o(h)$$
and
$$E_x\big[(X(h) - x)^2\big] = \sigma^2(x)\, h + o(h)$$
as $h \downarrow 0$. As a result, $\mu(x)$ is called the infinitesimal drift of the diffusion $X$ at $x$ and $\sigma^2(x)$ is the infinitesimal variance of $X$ at $x$.

Further discussions on this topic may be found in Doob [65] and Karlin and Taylor [108].
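The small-$h$ moment characterizations above suggest estimating $\mu(x)$ and $\sigma^2(x)$ from short-horizon increments (a sketch; we reuse the diffusion of Example 3.17, where $\mu(x) = rx$ and $\sigma^2(x) = c^2 x^2$, and the parameter values are our own):

```python
import numpy as np

rng = np.random.default_rng(7)
r, c, x, h, n = 0.3, 0.5, 2.0, 1e-3, 1_000_000

# One Euler step from X(0) = x approximates the true transition to O(h).
dB = rng.standard_normal(n) * np.sqrt(h)
Xh = x + r * x * h + c * x * dB

drift_est = np.mean(Xh - x) / h          # ~ mu(x)       = r x     = 0.6
var_est = np.mean((Xh - x) ** 2) / h     # ~ sigma^2(x)  = c^2 x^2 = 1.0
print(drift_est, var_est)
```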
3.4 Wiener Process and Stochastic Integrals in a Hilbert Space

In the previous section, we presented the elements of stochastic calculus for real stochastic processes. These elements remain valid for stochastic processes taking their values in a separable Hilbert space. However, the extensions can involve some difficulties when one is interested, for instance, in analytical properties of the sample paths of such processes. Of interest to us will be operator-valued random variables and their integrals.

Let $\mathbb K$ and $\mathbb H$ be two separable Hilbert spaces with norms $\|\cdot\|_{\mathbb K}$, $\|\cdot\|_{\mathbb H}$ and inner products $\langle \cdot, \cdot \rangle_{\mathbb K}$, $\langle \cdot, \cdot \rangle_{\mathbb H}$, respectively. From now on, without further specification, we always use the same symbol $\|\cdot\|$ to denote norms of operators regardless of the spaces involved when no confusion is possible.

3.4.1 Wiener Process in a Separable Hilbert Space

Let $(\Omega, \mathcal F, P, \mathcal F_t)$ be a filtered probability space and let $\beta_n(t)$ ($n = 1, 2, 3, \dots$) be a sequence of mutually independent real-valued standard Brownian motions on this space. Set
$$W(t) = \sum_{n=1}^{\infty} \sqrt{\lambda_n}\, \beta_n(t)\, e_n, \quad t \ge 0,$$
where $\lambda_n \ge 0$ ($n \ge 1$) are nonnegative real numbers and $(e_n)_{n \ge 1}$ is a complete orthonormal basis in $\mathbb K$. Let $Q \in B(\mathbb K, \mathbb K)$ be the operator defined by $Q e_n = \lambda_n e_n$, and assume that
$$\mathrm{Tr}\, Q = \sum_{i=1}^{\infty} \lambda_i < \infty.$$
Clearly, $EW(t) = 0$ and for all $t \ge s \ge 0$, the distribution of $W(t) - W(s)$ is $\mathcal N(0, (t-s)Q)$. The above-mentioned $\mathbb K$-valued stochastic process $W(t)$ is called a $Q$-Wiener process. In case the time set is $\mathbb R$, $W$ can be obtained as follows: let $W_i(t)$, $t \ge 0$, $i = 1, 2$, be independent $\mathbb K$-valued $Q$-Wiener processes; then
$$W(t) = \begin{cases} W_1(t) & \text{if } t \ge 0, \\ W_2(-t) & \text{if } t \le 0, \end{cases}$$
is a $Q$-Wiener process with $\mathbb R$ as time parameter and with values in $\mathbb K$.
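A truncated version of the series defining $W$ can be simulated coordinate-wise (a sketch; the eigenvalues $\lambda_n = 2^{-n}$ and the dimension cutoff $N$ are our own choices): the coordinates $\langle W(t), e_n \rangle = \sqrt{\lambda_n}\, \beta_n(t)$ are independent Brownian motions with variance $\lambda_n t$, and $E\|W(t)\|_{\mathbb K}^2 = t\, \mathrm{Tr}\, Q$ up to the truncation.

```python
import numpy as np

rng = np.random.default_rng(8)
N, n_paths, n_steps, T = 8, 20_000, 50, 1.0
lam = 0.5 ** np.arange(1, N + 1)        # trace-class: sum of lam is finite
dt = T / n_steps

# W(t) = sum_n sqrt(lam_n) beta_n(t) e_n; keep the first N coordinate processes.
dbeta = rng.standard_normal((n_paths, N, n_steps)) * np.sqrt(dt)
beta = np.cumsum(dbeta, axis=2)
W_coords = np.sqrt(lam)[None, :, None] * beta   # <W(t), e_n> for n = 1..N

# E ||W(T)||_K^2 = T * Tr Q (restricted to the truncation).
norm2 = np.sum(W_coords[:, :, -1] ** 2, axis=1)
print(norm2.mean(), T * lam.sum())
```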
3.4.2 Stochastic Integrals in a Hilbert Space
In order to define stochastic integrals with respect to the $Q$-Wiener process $W$, we introduce the subspace $K_0 = Q^{1/2}K$, which is a Hilbert space equipped with the norm
$$\|u\|_{K_0} = \|Q^{-1/2}u\|_K, \quad u \in K_0,$$
and define a space of operators
$$L_2^0 = L_2^0(K_0, H) = \left\{\Phi \in B(K_0, H) : \mathrm{Tr}\left[(\Phi Q^{1/2})(\Phi Q^{1/2})^*\right] < \infty\right\},$$
the space of all Hilbert–Schmidt operators from $K_0$ into $H$. It turns out that $L_2^0$ is a separable Hilbert space with norm
$$\|\Phi\|_{L_2^0}^2 = \mathrm{Tr}\left[(\Phi Q^{1/2})(\Phi Q^{1/2})^*\right]$$
for any $\Phi \in L_2^0$.
Clearly, for any bounded linear operator $\Phi \in B(K, H)$, this norm reduces to
$$\|\Phi\|_{L_2^0}^2 = \mathrm{Tr}\left[\Phi Q \Phi^*\right].$$
For any $T \ge 0$, let $\Phi = \Phi(t)$, $t \in [0, T]$, be an $\mathcal{F}_t$-adapted, $L_2^0$-valued process, and for any $t \in [0, T]$, define the norm
$$|\Phi|_t := \left\{E\int_0^t \mathrm{Tr}\left[(\Phi Q^{1/2})(\Phi Q^{1/2})^*\right]ds\right\}^{1/2}. \quad (3.30)$$
In general, we denote by $\mathcal{U}^2([0,T]; L_2^0)$ the class of all $L_2^0$-valued predictable processes $\Phi$ such that $|\Phi|_T < \infty$. The stochastic integral $\int_0^t \Phi(s)\,dW(s) \in H$ may be well defined for all $\Phi \in \mathcal{U}^2([0,T]; L_2^0)$ by
$$\int_0^t \Phi(s)\,dW(s) = L^2\text{-}\lim_{n\to\infty}\sum_{i=0}^{n}\int_0^t \Phi(s)\sqrt{\lambda_i}\,e_i\,d\beta_i(s), \quad t \in [0,T],$$
where $W$ is the $Q$-Wiener process defined above.
Proposition 3.22. For arbitrary $T \ge 0$, let $\Phi \in \mathcal{U}^2([0,T]; L_2^0)$. Then the stochastic integral $\int_0^t \Phi(s)\,dW(s)$ is a continuous, square integrable, $H$-valued martingale on $[0,T]$ and
$$E\left\|\int_0^t \Phi(s)\,dW(s)\right\|_H^2 = |\Phi|_t^2, \quad t \in [0,T]. \quad (3.31)$$
In fact, the stochastic integral $\int_0^t \Phi(s)\,dW(s)$, $t \ge 0$, may be extended to any $L_2^0$-valued adapted process satisfying
$$P\left\{\omega : \int_0^t \|\Phi(\omega, s)\|_{L_2^0}^2\,ds < \infty,\ 0 \le t \le T\right\} = 1.$$
Moreover, we may deduce the following generalized relation of (3.31):
$$E\left\|\int_0^t \Phi(s)\,dW(s)\right\|_H^2 \le E\int_0^t \|\Phi(s)\|_{L_2^0}^2\,ds, \quad 0 \le t \le T. \quad (3.32)$$
Note that the equality holds if the right-hand side of this inequality is finite.
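In a finite-dimensional truncation, the isometry (3.31) can be checked numerically for a constant integrand, since then $\int_0^t \Phi\,dW = \Phi W(t)$ and $|\Phi|_t^2 = t\,\mathrm{Tr}(\Phi Q \Phi^*)$. A minimal Monte Carlo sketch (the dimensions, the matrix `Phi`, and the eigenvalues of $Q$ below are arbitrary illustrative choices, not from the book):

```python
import numpy as np

# Hypothetical finite-dimensional setup: K = R^4, H = R^3, Phi a fixed matrix.
rng = np.random.default_rng(1)
lambdas = np.array([1.0, 0.5, 0.25, 0.125])   # eigenvalues of Q
Phi = rng.standard_normal((3, 4))             # a constant operator Phi in B(K, H)
t = 2.0

# Left side of (3.31): E || Phi W(t) ||^2, estimated by Monte Carlo.
n_samples = 20000
W_t = np.sqrt(t) * np.sqrt(lambdas) * rng.standard_normal((n_samples, 4))
lhs = np.mean(np.sum((W_t @ Phi.T) ** 2, axis=1))

# Right side: |Phi|_t^2 = t * Tr(Phi Q Phi^*).
rhs = t * np.trace(Phi @ np.diag(lambdas) @ Phi.T)
print(lhs, rhs)
```

With a few tens of thousands of samples, the two sides agree to within the usual $O(N^{-1/2})$ Monte Carlo error.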
The following proposition is a particular case of the Burkholder–Davis–Gundy inequality.
Proposition 3.23. For any $p \ge 2$ and for an arbitrary $L_2^0$-valued predictable process $\Phi(t)$, $t \in [0,T]$, one has
$$E\left[\sup_{s\in[0,t]}\left\|\int_0^s \Phi(\tau)\,dW(\tau)\right\|^p\right] \le C_p\,E\left[\int_0^t \|\Phi(s)\|_{L_2^0}^2\,ds\right]^{p/2} \quad (3.33)$$
for some constant $C_p > 0$.
For a proof, see, e.g., Da Prato and Zabczyk [47] or Seidler and Sobukawa [164].
Finally, let us quote Theorem 3 from Da Prato and Zabczyk [46], which is a
stochastic version of the Fubini theorem and enables us to interchange stochastic
and Bochner integrals.
Proposition 3.24. Let $(G, \mathcal{G}, \mu)$ be a measure space and let $h : \Omega\times[0,T]\times G \to L_2^0$ be an $\mathcal{F}\otimes\mathcal{B}([0,T])\otimes\mathcal{G}$-measurable mapping such that $h(\cdot,\cdot,x)$ is an $(\mathcal{F}_t)$-adapted stochastic process for each $x \in G$ and
$$\int_G\left\{\int_0^T E\|h(t,x)\|_{L_2^0}^2\,dt\right\}^{1/2}d\mu(x) < \infty.$$
Then
$$\int_G\left\{\int_0^T h(t,x)\,dW(t)\right\}d\mu(x) = \int_0^T\left\{\int_G h(t,x)\,d\mu(x)\right\}dW(t) \quad P\text{-a.s.}$$
3.4.3 Stochastic Convolution Integrals
Let $(\Omega,\mathcal{F},P,\mathcal{F}_t)$ be a filtered probability space. Let $\Delta = \{(s,t) : 0 \le s < t \le T\}$ and suppose that $U = \{U(t,s) : (s,t)\in\Delta\}$ is an evolution operator as in Chapter 2.

Denote by $\mathcal{M}$ the $\sigma$-field of $(\mathcal{F}_t)$-progressively measurable sets over $\Omega\times\mathbb{R}_+$ and by $\Phi$ an $\mathcal{M}$-measurable $L_2^0$-valued process.

Define
$$I(t) = \int_0^t U(t,s)\Phi(s)\,dW(s), \quad 0 \le t \le T.$$
This integral is well defined provided that
$$\int_0^t\|U(t,s)\Phi(s)\|_{L_2^0}^2\,ds < \infty \quad P\text{-a.s.}, \quad 0 \le t \le T.$$
Such an integral is called a stochastic convolution integral.
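In the simplest scalar case, a stochastic convolution can be approximated by a Riemann–Itô sum and compared with a closed-form variance. The sketch below assumes (purely for illustration, not the book's setting) $U(t,s) = e^{-\lambda(t-s)}$ and a constant scalar integrand $\sigma$, for which $\mathrm{Var}\,I(t) = \sigma^2(1-e^{-2\lambda t})/(2\lambda)$:

```python
import numpy as np

# Scalar illustration: U(t, s) = exp(-lam*(t-s)), Phi(s) = sigma constant,
# W a standard real Brownian motion. Then I(t) is an Ornstein-Uhlenbeck-type
# integral with Var I(t) = sigma^2 * (1 - exp(-2*lam*t)) / (2*lam).
rng = np.random.default_rng(2)
lam, sigma, t_final, n_steps, n_paths = 1.5, 0.8, 2.0, 400, 20000
dt = t_final / n_steps
s = np.arange(n_steps) * dt                      # left endpoints of the intervals
dW = rng.standard_normal((n_paths, n_steps)) * np.sqrt(dt)
kernel = np.exp(-lam * (t_final - s)) * sigma    # U(t_final, s) * Phi(s)
I_t = dW @ kernel                                # Riemann-Ito sum approximating I(t_final)

mc_var = I_t.var()
exact_var = sigma**2 * (1.0 - np.exp(-2.0 * lam * t_final)) / (2.0 * lam)
print(mc_var, exact_var)
```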
From the formula
$$\int_\sigma^t (t-s)^{\alpha-1}(s-\sigma)^{-\alpha}\,ds = \frac{\pi}{\sin\pi\alpha}, \quad \text{for } \sigma \le t,\ \alpha\in(0,1),$$
established in Da Prato, Kwapień, and Zabczyk [45], it follows that
$$\int_0^t U(t,s)\Phi(s)\,dW(s) = \frac{\sin\pi\alpha}{\pi}\,(R_\alpha S_\alpha)(t) \quad \text{a.s.}, \quad (3.34)$$
where
$$(R_\alpha Z)(t) = \int_0^t (t-s)^{\alpha-1}U(t,s)Z(s)\,ds$$
with
$$S_\alpha(s) = \int_0^s (s-\sigma)^{-\alpha}U(s,\sigma)\Phi(\sigma)\,dW(\sigma).$$
The use of the representation (3.34) is the very core of the factorization method as treated in Da Prato, Kwapień, and Zabczyk [45] and Da Prato and Zabczyk [48].
It is possible to derive estimates for $I(t)$ provided that the evolution operator $U$ is exponentially stable, that is,
$$\|U(t,s)\| \le M e^{-\delta(t-s)} \quad (3.35)$$
for some constants $M > 0$ and $\delta > 0$ and for all $t \ge s \ge 0$.
Let $H_\alpha$, $\alpha\in[0,1]$, be intermediate Banach spaces such that $H_0 = H$, $H_\beta$ is continuously embedded into $H_\alpha$ whenever $1 \ge \beta \ge \alpha \ge 0$, and for each $\alpha\in[0,1]$ there exists a constant $L_\alpha$ such that $U(t,s) \in B(H, H_\alpha)$ and
$$\|U(t,s)\|_{B(H,H_\alpha)} \le L_\alpha (t-s)^{-\alpha}$$
for all $0 \le s < t \le T$. As in Chapter 2, we shall denote the norm in $H_\alpha$ simply by $\|\cdot\|_\alpha$.
We now state the maximal inequality.
Proposition 3.25. Let $p > 2$ and $\alpha \in \left[0, \frac{p-2}{2p}\right)$. Let $\Phi : [0,T] \to L_2^0$ be an $(\mathcal{F}_t)$-adapted measurable stochastic process such that
$$\int_0^T E\|\Phi(s)\|_{L_2^0}^p\,ds < \infty.$$
Then
$$E\left[\sup_{0\le t\le T}\left\|\int_0^t U(t,s)\Phi(s)\,dW(s)\right\|_\alpha^p\right] \le C\int_0^T E\|\Phi(s)\|_{L_2^0}^p\,ds,$$
where the constant $C$ depends only on $p$, $\alpha$, $T$, and $U$.
For the proof, we refer the reader to Seidler [163].
The above proposition can be extended to $(\mathcal{F}_t)$-adapted measurable stochastic processes whose time set is $\mathbb{R}$.
Proposition 3.26. Let $p > 2$, $0 < \alpha < 1$, $\alpha + \frac{1}{p} < \beta < \frac{1}{2}$, and let $\Phi : \mathbb{R} \to L_2^0$ be an $(\mathcal{F}_t)$-adapted measurable stochastic process such that
$$\sup_{t\in\mathbb{R}} E\|\Phi(t)\|_{L_2^0}^p < \infty.$$
Then

(i) $\displaystyle E\left\|\int_{-\infty}^t (t-s)^{-\beta}U(t,s)P(s)\Phi(s)\,dW(s)\right\|^p \le C_p N^p C_1(\beta,\delta,p)\sup_{t\in\mathbb{R}}E\|\Phi(t)\|_{L_2^0}^p$;

(ii) $\displaystyle E\left\|\int_{-\infty}^t U(t,s)P(s)\Phi(s)\,dW(s)\right\|_\alpha^p \le C_p N^p M(\alpha)^p C_2(\alpha,\beta,\delta,p)\,C_1(\beta,\delta,p)\sup_{t\in\mathbb{R}}E\|\Phi(t)\|_{L_2^0}^p$;

(iii) $\displaystyle E\left\|\int_t^{\infty} U(t,s)Q(s)\Phi(s)\,dW(s)\right\|_\alpha^p \le C_p N^p M(\alpha)^p C_3(\beta,\delta,p)\,C_1(\beta,\delta,p)\sup_{t\in\mathbb{R}}E\|\Phi(t)\|_{L_2^0}^p$,

where
$$C_1(\beta,\delta,p) = \left[\Gamma\Big(1-\frac{2\beta p}{p-2}\Big)\right]^{\frac{p-2}{2}}(2\delta)^{\frac{(2\beta-1)p}{2}},$$
$$C_2(\alpha,\beta,\delta,p) = \left[\frac{\sin\pi\beta}{\pi}\right]^p\left[\Gamma\Big(1-\frac{p}{p-1}(1+\alpha-\beta)\Big)\right]^{p-1}\delta^{-p(\beta-\alpha)},$$
$$C_3(\beta,\delta,p) = \left[\frac{\sin\pi\beta}{\pi}\right]^p\left[\Gamma\Big(1-\frac{p}{p-1}(1-\beta)\Big)\right]^{p-1}\Big(\frac{\delta}{2}\Big)^{-p\beta},$$
with $\Gamma$ the classical Gamma function. Here, $P(t)$, $t\in\mathbb{R}$, are projections that are uniformly bounded and strongly continuous in $t$, $Q(t) = I - P(t)$, and $W$ is a $Q$-Wiener process with values in $K$ and with time set $\mathbb{R}$.
Proof. (i) A direct application of Proposition 3.23 and Hölder's inequality, with the help of (3.35), allows us to write
$$E\left\|\int_{-\infty}^t (t-\sigma)^{-\beta}U(t,\sigma)P(\sigma)\Phi(\sigma)\,dW(\sigma)\right\|^p \le C_p\,E\left[\int_{-\infty}^t (t-\sigma)^{-2\beta}\|U(t,\sigma)P(\sigma)\Phi(\sigma)\|_{L_2^0}^2\,d\sigma\right]^{p/2}$$
$$\le C_p N^p\,E\left[\int_{-\infty}^t (t-\sigma)^{-2\beta}e^{-2\delta(t-\sigma)}\|\Phi(\sigma)\|_{L_2^0}^2\,d\sigma\right]^{p/2}$$
$$\le C_p N^p\left[\int_{-\infty}^t (t-\sigma)^{-\frac{2\beta p}{p-2}}e^{-2\delta(t-\sigma)}\,d\sigma\right]^{\frac{p-2}{2}}\left[\int_{-\infty}^t e^{-2\delta(t-\sigma)}E\|\Phi(\sigma)\|_{L_2^0}^p\,d\sigma\right]$$
$$\le C_p N^p\left[\Gamma\Big(1-\frac{2\beta p}{p-2}\Big)(2\delta)^{\frac{2\beta p}{p-2}-1}\right]^{\frac{p-2}{2}}\frac{1}{2\delta}\sup_{t\in\mathbb{R}}E\|\Phi(t)\|_{L_2^0}^p$$
$$= C_p N^p\,C_1(\beta,\delta,p)\sup_{t\in\mathbb{R}}E\|\Phi(t)\|_{L_2^0}^p.$$
To prove (ii), we use the factorization method for the stochastic convolution integral:
$$\int_{-\infty}^t U(t,s)P(s)\Phi(s)\,dW(s) = \frac{\sin\pi\beta}{\pi}\,(R_\beta S_\beta)(t) \quad \text{a.s.},$$
where
$$(R_\beta S_\beta)(t) = \int_{-\infty}^t (t-s)^{\beta-1}U(t,s)P(s)S_\beta(s)\,ds$$
with
$$S_\beta(s) = \int_{-\infty}^s (s-\sigma)^{-\beta}U(s,\sigma)P(\sigma)\Phi(\sigma)\,dW(\sigma),$$
and $\beta$ satisfying $\alpha + \frac{1}{p} < \beta < \frac{1}{2}$.
We can now evaluate $E\left\|\int_{-\infty}^t U(t,s)P(s)\Phi(s)\,dW(s)\right\|_\alpha^p$:
$$E\left\|\int_{-\infty}^t U(t,s)P(s)\Phi(s)\,dW(s)\right\|_\alpha^p \le \left[\frac{\sin\pi\beta}{\pi}\right]^p E\left[\int_{-\infty}^t (t-s)^{\beta-1}\|U(t,s)P(s)S_\beta(s)\|_\alpha\,ds\right]^p$$
$$\le M(\alpha)^p\left[\frac{\sin\pi\beta}{\pi}\right]^p E\left[\int_{-\infty}^t (t-s)^{\beta-1-\alpha}e^{-\delta(t-s)}\|S_\beta(s)\|\,ds\right]^p$$
$$\le M(\alpha)^p\left[\frac{\sin\pi\beta}{\pi}\right]^p\left[\int_{-\infty}^t (t-s)^{\frac{p}{p-1}(\beta-1-\alpha)}e^{-\delta(t-s)}\,ds\right]^{p-1}\left[\int_{-\infty}^t e^{-\delta(t-s)}E\|S_\beta(s)\|^p\,ds\right]$$
$$\le M(\alpha)^p\,C_2(\alpha,\beta,\delta,p)\sup_{s\in\mathbb{R}}E\|S_\beta(s)\|^p. \quad (3.36)$$
On the other hand, it follows from part (i) that
$$E\|S_\beta(t)\|^p \le C_p N^p C_1(\beta,\delta,p)\sup_{t\in\mathbb{R}}E\|\Phi(t)\|_{L_2^0}^p. \quad (3.37)$$
Thus,
$$E\left\|\int_{-\infty}^t U(t,s)P(s)\Phi(s)\,dW(s)\right\|_\alpha^p \le C_p N^p M(\alpha)^p C_2(\alpha,\beta,\delta,p)\,C_1(\beta,\delta,p)\sup_{t\in\mathbb{R}}E\|\Phi(t)\|_{L_2^0}^p.$$
To prove (iii), we also use the factorization method:
$$E\left\|\int_t^{\infty} U(t,s)Q(s)\Phi(s)\,dW(s)\right\|_\alpha^p \le \left[\frac{\sin\pi\beta}{\pi}\right]^p E\left[\int_t^{\infty}(s-t)^{\beta-1}\|U(t,s)Q(s)S_\beta(s)\|_\alpha\,ds\right]^p$$
$$\le M(\alpha)^p\left[\frac{\sin\pi\beta}{\pi}\right]^p E\left[\int_t^{\infty}(s-t)^{\beta-1}e^{-\frac{\delta}{2}(s-t)}\|S_\beta(s)\|\,ds\right]^p$$
$$\le M(\alpha)^p\left[\frac{\sin\pi\beta}{\pi}\right]^p\left[\int_t^{\infty}(s-t)^{\frac{p}{p-1}(\beta-1)}e^{-\frac{\delta}{2}(s-t)}\,ds\right]^{p-1}\left[\int_t^{\infty}e^{-\frac{\delta}{2}(s-t)}E\|S_\beta(s)\|^p\,ds\right]$$
$$\le M(\alpha)^p\,C_3(\beta,\delta,p)\sup_{s\in\mathbb{R}}E\|S_\beta(s)\|^p. \quad (3.38)$$
It follows from (3.37) that
$$E\left\|\int_t^{\infty} U(t,s)Q(s)\Phi(s)\,dW(s)\right\|_\alpha^p \le C_p N^p M(\alpha)^p C_3(\beta,\delta,p)\,C_1(\beta,\delta,p)\sup_{t\in\mathbb{R}}E\|\Phi(t)\|_{L_2^0}^p.$$
3.5 Existence of Solutions of Stochastic Differential Equations in
a Hilbert Space
During the last few decades, stochastic differential equations in a separable Hilbert
space have been of great interest to several mathematicians and various results on
the existence, uniqueness, stability, and other quantitative and qualitative properties
of solutions have been established. For example, in their book [47], Da Prato and Zabczyk established a systematic theory of existence, uniqueness, and ergodicity for infinite-dimensional systems. The literature on those equations
is quite extensive; for more on this topic and related applications we refer the reader
to Appleby [9], Caraballo and Kai Liu [33], Kai Liu [106], and Luo [133, 134].
3.5.1 Existence and Uniqueness
Let us first consider the following stochastic differential equation on $H$:
$$\begin{cases} dX(t) = f(X(t))\,dt + g(X(t))\,dW(t), & t\in[0,T],\\ X(0) = x_0, \end{cases} \quad (3.39)$$
where $f : H \to H$ and $g : H \to L_2^0$ are Borel measurable, $x_0 \in H$ is either nonrandom or $\mathcal{F}_0$-measurable, and $W$ is a $Q$-Wiener process with values in $K$. Equation (3.39) can be written as a stochastic integral equation
$$X(t) = x_0 + \int_0^t f(X(s))\,ds + \int_0^t g(X(s))\,dW(s).$$
Moreover, we assume that $f$ and $g$ satisfy the following Lipschitz condition:

(A) There exists a positive constant $C$ such that
$$\|f(x)-f(y)\|_H \le C\|x-y\|_H \quad \text{and} \quad \|g(x)-g(y)\|_{L_2^0} \le C\|x-y\|_H.$$
Therefore, according to Yor [185] and Miyahara [144], we have

Proposition 3.27. There exists a unique solution $X$ of Eq. (3.39), which is a diffusion with generator $L$:
$$Lh(x) = \langle h'(x), f(x)\rangle_H + \frac{1}{2}\,\mathrm{Tr}\left[g^*(x)h''(x)g(x)\right].$$
Moreover, $X$ has continuous paths, i.e.,
$$P\left\{\omega : \lim_{t\to s}\|X(\omega,t) - X(\omega,s)\|_H = 0\right\} = 1.$$
Proof. See Yor [185].
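Although Proposition 3.27 is stated in a Hilbert space, in a finite-dimensional truncation the solution of (3.39) is commonly approximated by the Euler–Maruyama scheme. The sketch below is an illustration only; the drift $f(x) = -x$ and constant diffusion $g(x) = 0.5$ are assumed coefficients chosen to satisfy the Lipschitz condition (A):

```python
import numpy as np

def euler_maruyama(f, g, x0, t_final, n_steps, rng):
    """Euler-Maruyama approximation of dX = f(X) dt + g(X) dW, X(0) = x0."""
    dt = t_final / n_steps
    x = np.array(x0, dtype=float)
    path = [x.copy()]
    for _ in range(n_steps):
        dw = rng.standard_normal(x.shape) * np.sqrt(dt)
        x = x + f(x) * dt + g(x) * dw
        path.append(x.copy())
    return np.array(path)

# Globally Lipschitz coefficients: f(x) = -x, g(x) = 0.5 (additive noise).
rng = np.random.default_rng(3)
path = euler_maruyama(lambda x: -x, lambda x: 0.5, [1.0, -2.0], 1.0, 1000, rng)
print(path.shape, path[0])
```

Under the Lipschitz condition (A), this scheme is known to converge to the true solution as the step size tends to zero.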
3.5.2 $L^2$-Bounded Solutions
Here, we are interested in studying the $L^2$-boundedness of solutions of stochastic differential equations of the form
$$dX(t) = [AX(t) + f(t,X(t))]\,dt + g(t,X(t))\,dW(t), \quad t\in\mathbb{R}, \quad (3.40)$$
with the initial condition
$$X(0) = x_0 \in D(A),$$
where $A : D = D(A)\subset H\to H$ is a densely defined closed (possibly unbounded) linear operator, $f : \mathbb{R}\times H\to H$ and $g : \mathbb{R}\times H\to L_2^0$ are jointly continuous functions, and $W(t)$ is a $Q$-Wiener process with values in $K$.
In what follows we adopt the following assumptions:

(3H)$_0$ The operator $A$ is the infinitesimal generator of a uniformly exponentially stable $C_0$-semigroup $(T(t))_{t\ge0}$ defined on $H$, that is, there exist constants $M, \delta > 0$ such that
$$\|T(t)\| \le M e^{-\delta t}, \quad t \ge 0.$$

(3H)$_1$ The coefficients $f(\cdot,\cdot)$ and $g(\cdot,\cdot)$ satisfy the following Lipschitz conditions: there exist positive constants $C_i$, $i = 1, 2$, such that
$$\|f(t,x)-f(t,y)\| \le C_1\|x-y\|, \qquad \|g(t,x)-g(t,y)\|_{L_2^0} \le C_2\|x-y\|,$$
for all $t\in\mathbb{R}$ and $x, y\in H$.

(3H)$_2$ There exists a constant $C$ such that
$$\|f(t,x)\|^2 + \|g(t,x)\|_{L_2^0}^2 \le C,$$
for all $t\in\mathbb{R}$ and $x\in H$.

For convenience, we recall from Ichikawa [104] two kinds of solutions of (3.40).

Definition 3.33. A stochastic process $\{X(t), t\in\mathbb{R}\}$ is said to be a strong solution to (3.40) if
(i) $X(t)$ is adapted to $\mathcal{F}_t$;
(ii) $X(t)$ is continuous in $t$ almost surely;
(iii) $X(t) \in D$ for any $t$, $\int_{-T}^{T}\|AX(t)\|\,dt < \infty$ almost surely for any $T > 0$, and
$$X(t) = X(s) + \int_s^t AX(\tau)\,d\tau + \int_s^t f(\tau, X(\tau))\,d\tau + \int_s^t g(\tau, X(\tau))\,dW(\tau)$$
for all $t \ge s$ with probability one.
In most situations, we find that the concept of strong solution is too limited to include important examples. There is a weaker concept, the mild solution, which is found to be more appropriate for practical purposes.
Definition 3.34. A stochastic process $\{X(t), t\in\mathbb{R}\}$ is said to be a mild solution to (3.40) if
(i) $X(t)$ is adapted to $\mathcal{F}_t$;
(ii) $X(t)$ is continuous in $t$ almost surely;
(iii) $X$ is measurable with $\int_{-T}^{T}\|X(t)\|^2\,dt < \infty$ almost surely for any $T > 0$, and
$$X(t) = T(t-s)X(s) + \int_s^t T(t-\tau)f(\tau,X(\tau))\,d\tau + \int_s^t T(t-\tau)g(\tau,X(\tau))\,dW(\tau)$$
for all $t \ge s$ with probability one.
Let $p \ge 2$ and denote by $L^p(\Omega, H)$ the collection of all strongly measurable, $p$-th integrable $H$-valued random variables. It is then routine to check that $L^p(\Omega, H)$ is a Banach space when it is equipped with the norm
$$\|V\|_{L^p(\Omega,H)} := \left(E\|V\|^p\right)^{1/p},$$
for each $V\in L^p(\Omega, H)$.
Let $BUC(\mathbb{R}; L^p(\Omega; H))$ stand for the collection of all processes $X = \{X(t), t\in\mathbb{R}\}$ which are bounded and uniformly continuous in $L^p(\Omega, H)$. We can, and do, speak of such a process as a function $X$ from $\mathbb{R}$ into $L^p(\Omega; H)$. It is then easy to check that $BUC(\mathbb{R}; L^p(\Omega; H))$ is a Banach space when it is equipped with the norm
$$\|X\|_\infty = \sup_{t\in\mathbb{R}}\|X(t)\|_{L^p(\Omega,H)}.$$
In this section, for simplicity, we assume that $p = 2$.
We have the following well-known theorem.

Theorem 3.6. Suppose that assumptions (3H)$_0$, (3H)$_1$, and (3H)$_2$ hold. Then Equation (3.40) has a unique uniformly continuous and $L^2$-bounded mild solution $X(t)$, which can be explicitly expressed as
$$X(t) = \int_{-\infty}^t T(t-\tau)f(\tau,X(\tau))\,d\tau + \int_{-\infty}^t T(t-\tau)g(\tau,X(\tau))\,dW(\tau)$$
for each $t\in\mathbb{R}$, whenever $\Theta := 2M^2\left(\dfrac{C_1}{\delta^2} + \dfrac{C_2}{\delta}\right) < 1$.
Proof. Define the operator $\Lambda$ on $BUC(\mathbb{R}; L^2(\Omega; H))$ by
$$(\Lambda X)(t) = \int_{-\infty}^t T(t-\tau)f(\tau,X(\tau))\,d\tau + \int_{-\infty}^t T(t-\tau)g(\tau,X(\tau))\,dW(\tau).$$
Let us first show that $\Lambda X(\cdot)$ is uniformly continuous whenever $X$ is. Clearly, the mappings $\tau\mapsto f(\tau,X(\tau))$ and $\tau\mapsto g(\tau,X(\tau))$ are uniformly continuous and uniformly $L^2$-bounded. That is, for any $\varepsilon > 0$, there is an $h > 0$ sufficiently small such that
$$E\|f(t+h, X(t+h)) - f(t, X(t))\|^2 < \frac{\varepsilon\delta^2}{4M^2}$$
and
$$E\|g(t+h, X(t+h)) - g(t, X(t))\|_{L_2^0}^2 < \frac{\varepsilon\delta}{2M^2},$$
for all $t\in\mathbb{R}$.
Then
$$E\|\Lambda X(t+h) - \Lambda X(t)\|^2 \le 2E\left\|\int_{-\infty}^t T(t-\tau)\left[f(\tau+h,X(\tau+h)) - f(\tau,X(\tau))\right]d\tau\right\|^2$$
$$\qquad + 2E\left\|\int_{-\infty}^t T(t-\tau)\left[g(\tau+h,X(\tau+h)) - g(\tau,X(\tau))\right]dW(\tau)\right\|^2.$$
Using assumption (3H)$_0$, the Cauchy–Schwarz inequality, and the isometry identity, we have
$$E\|\Lambda X(t+h)-\Lambda X(t)\|^2 \le 2E\left[\int_{-\infty}^t\|T(t-\tau)\|\,\|f(\tau+h,X(\tau+h))-f(\tau,X(\tau))\|\,d\tau\right]^2$$
$$\qquad + 2\int_{-\infty}^t\|T(t-\tau)\|^2\,E\|g(\tau+h,X(\tau+h))-g(\tau,X(\tau))\|_{L_2^0}^2\,d\tau$$
$$\le 2M^2 E\left[\int_{-\infty}^t e^{-\delta(t-\tau)}\,d\tau\right]\left[\int_{-\infty}^t e^{-\delta(t-\tau)}\|f(\tau+h,X(\tau+h))-f(\tau,X(\tau))\|^2\,d\tau\right]$$
$$\qquad + 2M^2\int_{-\infty}^t e^{-2\delta(t-\tau)}E\|g(\tau+h,X(\tau+h))-g(\tau,X(\tau))\|_{L_2^0}^2\,d\tau$$
$$\le 2M^2\left[\int_{-\infty}^t e^{-\delta(t-\tau)}\,d\tau\right]^2\sup_{\tau\in\mathbb{R}}E\|f(\tau+h,X(\tau+h))-f(\tau,X(\tau))\|^2$$
$$\qquad + 2M^2\left[\int_{-\infty}^t e^{-2\delta(t-\tau)}\,d\tau\right]\sup_{\tau\in\mathbb{R}}E\|g(\tau+h,X(\tau+h))-g(\tau,X(\tau))\|_{L_2^0}^2$$
$$\le \frac{2M^2}{\delta^2}\sup_{\tau\in\mathbb{R}}E\|f(\tau+h,X(\tau+h))-f(\tau,X(\tau))\|^2 + \frac{M^2}{\delta}\sup_{\tau\in\mathbb{R}}E\|g(\tau+h,X(\tau+h))-g(\tau,X(\tau))\|_{L_2^0}^2$$
$$\le \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$
Next, we show that $\Lambda X(\cdot)$ is $L^2$-bounded. For a fixed $t\in\mathbb{R}$, we have
$$E\|\Lambda X(t)\|^2 \le 2E\left\|\int_{-\infty}^t T(t-s)f(s,X(s))\,ds\right\|^2 + 2E\left\|\int_{-\infty}^t T(t-s)g(s,X(s))\,dW(s)\right\|^2 = I_1 + I_2.$$
Using assumption (3H)$_0$, an application of the Cauchy–Schwarz inequality, followed by (3H)$_2$, gives us
$$I_1 \le 2M^2 E\left[\int_{-\infty}^t e^{-\delta(t-s)}\|f(s,X(s))\|\,ds\right]^2 \le 2M^2\left[\int_{-\infty}^t e^{-\delta(t-s)}\,ds\right]\left[\int_{-\infty}^t e^{-\delta(t-s)}E\|f(s,X(s))\|^2\,ds\right]$$
$$\le 2M^2\left[\int_{-\infty}^t e^{-\delta(t-s)}\,ds\right]^2\sup_{s\in\mathbb{R}}E\|f(s,X(s))\|^2 \le \frac{2CM^2}{\delta^2}.$$
As to $I_2$, in a similar manner (with the additional help of the Itô isometry), we have
$$I_2 \le 2\int_{-\infty}^t\|T(t-s)\|^2 E\|g(s,X(s))\|_{L_2^0}^2\,ds \le 2M^2\int_{-\infty}^t e^{-2\delta(t-s)}E\|g(s,X(s))\|_{L_2^0}^2\,ds$$
$$\le 2M^2\left[\int_{-\infty}^t e^{-2\delta(t-s)}\,ds\right]\sup_{s\in\mathbb{R}}E\|g(s,X(s))\|_{L_2^0}^2 \le \frac{CM^2}{\delta}.$$
Combining, we conclude that
$$E\|\Lambda X(t)\|^2 \le 2M^2 C\left(\frac{1}{\delta^2} + \frac{1}{\delta}\right) \quad (3.41)$$
for all $t\in\mathbb{R}$.
Finally, we show that $\Lambda$ is a contraction. Let $X$ and $Y$ be in $BUC(\mathbb{R}; L^2(\Omega, H))$. Proceeding as before, we obtain
$$E\|\Lambda X(t)-\Lambda Y(t)\|^2 \le 2E\left\|\int_{-\infty}^t T(t-\tau)\left[f(\tau,X(\tau))-f(\tau,Y(\tau))\right]d\tau\right\|^2$$
$$\qquad + 2E\left\|\int_{-\infty}^t T(t-\tau)\left[g(\tau,X(\tau))-g(\tau,Y(\tau))\right]dW(\tau)\right\|^2.$$
Using assumption (3H)$_0$, an application of the Cauchy–Schwarz inequality and the isometry identity, followed by (3H)$_1$, gives
$$E\|\Lambda X(t)-\Lambda Y(t)\|^2 \le 2M^2\left[\int_{-\infty}^t e^{-\delta(t-\tau)}\,d\tau\right]\left[\int_{-\infty}^t e^{-\delta(t-\tau)}E\|f(\tau,X(\tau))-f(\tau,Y(\tau))\|^2\,d\tau\right]$$
$$\qquad + 2M^2\int_{-\infty}^t e^{-2\delta(t-\tau)}E\|g(\tau,X(\tau))-g(\tau,Y(\tau))\|_{L_2^0}^2\,d\tau$$
$$\le 2M^2 C_1\left[\int_{-\infty}^t e^{-\delta(t-\tau)}\,d\tau\right]\left[\int_{-\infty}^t e^{-\delta(t-\tau)}E\|X(\tau)-Y(\tau)\|^2\,d\tau\right] + 2M^2 C_2\int_{-\infty}^t e^{-2\delta(t-\tau)}E\|X(\tau)-Y(\tau)\|^2\,d\tau$$
$$\le 2M^2 C_1\left[\int_{-\infty}^t e^{-\delta(t-\tau)}\,d\tau\right]^2\sup_{\tau\in\mathbb{R}}E\|X(\tau)-Y(\tau)\|^2 + 2M^2 C_2\left[\int_{-\infty}^t e^{-2\delta(t-\tau)}\,d\tau\right]\sup_{\tau\in\mathbb{R}}E\|X(\tau)-Y(\tau)\|^2$$
$$\le 2M^2\left(\frac{C_1}{\delta^2} + \frac{C_2}{\delta}\right)\|X-Y\|_\infty^2 = \Theta\,\|X-Y\|_\infty^2.$$
Consequently, if $\Theta < 1$, then $\Lambda$ is a contraction mapping, and this completes the proof.
3.5.3 Stochastic Delay Differential Equation and Exponential
Stability
Let us now allow the coefficients of the stochastic differential equation (3.40) to depend on values in the past. We then obtain the so-called stochastic delay differential equation.
Here, we are interested in studying stochastic delay differential equations of the form
$$dX(t) = [AX(t) + f(t,X_t)]\,dt + g(t,X_t)\,dW(t), \quad t\in\mathbb{R}_+, \quad (3.42)$$
with the initial condition
$$X(\theta) = \varphi(\theta), \quad \varphi\in C([-r,0], H),$$
where $A : D = D(A)\subset H\to H$ is a densely defined closed (possibly unbounded) linear operator, the history $X_t\in C_r := C([-r,0], H)$ with $r > 0$ ($X_t$ being defined by $X_t(\theta) := X(t+\theta)$ for each $\theta\in[-r,0]$), $f : \mathbb{R}_+\times C_r\to H$ and $g : \mathbb{R}_+\times C_r\to L_2^0$ are jointly continuous functions, and $W(t)$ is a $Q$-Wiener process with values in $K$.
Here, $C_r$ is the space of continuous functions from $[-r,0]$ into $H$, equipped with the sup norm given by
$$\|z\|_{C_r} = \left[\sup_{-r\le\theta\le0}\|z(\theta)\|^2\right]^{1/2}.$$
Such an equation is called a stochastic autonomous differential equation with finite delay.
In what follows, in addition to (3H)$_0$, we require the following assumption:

(3H)$_3$ The coefficients $f(\cdot,\cdot)$ and $g(\cdot,\cdot)$ satisfy the following Lipschitz and linear growth conditions: there exist positive constants $K_i$ ($i = 1, 2, 3$) such that
$$\|f(t,x)-f(t,y)\| \le K_1\|x-y\|_{C_r}, \qquad \|g(t,x)-g(t,y)\|_{L_2^0} \le K_2\|x-y\|_{C_r},$$
$$\|f(t,x)\| + \|g(t,x)\|_{L_2^0} \le K_3\left(1 + \|x\|_{C_r}\right),$$
for all $t\in\mathbb{R}_+$ and $x, y\in C_r$.
Definition 3.35. A stochastic process $\{X(t), t \ge -r\}$ is said to be a mild solution of (3.42) if
(i) $X(t)$ is adapted to $\mathcal{F}_t$;
(ii) $X(t)$ is continuous in $t$ almost surely;
(iii) $X$ is measurable with $\int_0^T\|X(t)\|^2\,dt < \infty$ almost surely for any $T > 0$, and
$$X(t;\varphi) = T(t)\varphi(0) + \int_0^t T(t-s)f(s,X_s)\,ds + \int_0^t T(t-s)g(s,X_s)\,dW(s)$$
for all $t \ge 0$ with probability one;
(iv) $X(\theta) = \varphi(\theta)$, $-r \le \theta \le 0$, almost surely.
Here, since we are now mainly interested in the exponential stability of the mild solution to (3.42), we introduce the notion of such stability. For the purposes of the stability of (3.42), we shall assume that
$$f(t,0) = g(t,0) \equiv 0 \quad \text{for any } t \ge 0,$$
so that (3.42) admits a trivial solution when $\varphi \equiv 0$. We denote by $X(t;\varphi)$ a global mild solution of (3.42) corresponding to the initial value $\varphi\in C_r$, if one exists.
Definition 3.36. $X(t;\varphi)$ is said to be exponentially stable in mean square if there is a pair of positive constants $\gamma$ and $C$ such that, for any initial value $\varphi\in C_r$,
$$E\|X_t(\cdot;\varphi)\|_{C_r}^2 \le C\|\varphi\|_{C_r}^2\,e^{-\gamma t} \quad \text{for all } t \ge 0.$$
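The notion can be made concrete in the scalar linear case without delay, $dX = -aX\,dt + \sigma X\,dW(t)$ (an illustrative special case, not the general equation (3.42)): the second moment $m(t) = E\,X(t)^2$ satisfies the deterministic ODE $m' = (-2a+\sigma^2)m$, so the trivial solution is exponentially stable in mean square precisely when $\sigma^2 < 2a$. A short sketch of this closed form:

```python
import numpy as np

def second_moment(a, sigma, x0, t_grid):
    """Closed-form second moment of dX = -a X dt + sigma X dW, X(0) = x0:
    m(t) = x0^2 * exp((-2a + sigma^2) t)."""
    return x0**2 * np.exp((-2.0 * a + sigma**2) * t_grid)

t_grid = np.linspace(0.0, 5.0, 101)
m_stable = second_moment(a=1.0, sigma=0.5, x0=2.0, t_grid=t_grid)    # sigma^2 < 2a
m_unstable = second_moment(a=1.0, sigma=2.0, x0=2.0, t_grid=t_grid)  # sigma^2 > 2a
print(m_stable[-1], m_unstable[-1])
```

Note that multiplicative noise can destroy the mean-square stability of an otherwise stable drift, which is why Theorem 3.7 below requires the constants $K_i$ to be small.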
We obtain the following well-known theorem.
Theorem 3.7. Suppose that assumptions (3H)$_0$ and (3H)$_3$ hold. Then the mild solution $X(t;\varphi)$ of (3.42) is exponentially stable in mean square whenever the positive constants $K_i$ ($i = 1, 2, 3$) are small enough.
Proof. The proof of the existence and uniqueness of a mild solution of (3.42) is omitted; it can be obtained by the well-known Picard iteration. For the sake of clarity and completeness, the proof of the exponential stability of the solution is reproduced here, even though many authors (e.g., Keck and McKibben [110], Luo [133]) have obtained stability for equations more general than (3.42).
For a fixed $t \ge 0$, we have
$$E\|X_t(\cdot;\varphi)\|_{C_r}^2 \le 3\sup_{-r\le\theta\le0}\left\{\|T(t+\theta)\varphi(0)\|^2\right\} + 3\sup_{-r\le\theta\le0}E\left\|\int_0^{t+\theta}T(t+\theta-s)f(s,X_s)\,ds\right\|^2$$
$$\qquad + 3\sup_{-r\le\theta\le0}E\left\|\int_0^{t+\theta}T(t+\theta-s)g(s,X_s)\,dW(s)\right\|^2 = I_1' + I_2' + I_3'.$$
Using (3H)$_0$ yields
$$I_1' \le 3M^2\|\varphi(0)\|^2\sup_{-r\le\theta\le0}\left\{e^{-\delta(t+\theta)}\right\} \le 3M^2\|\varphi(0)\|^2\,e^{-\delta(t-r)}.$$
Next, using again (3H)$_0$, an application of the Cauchy–Schwarz inequality, followed by (3H)$_3$, gives us
$$I_2' \le 3M^2\sup_{-r\le\theta\le0}E\left[\int_0^{t+\theta}e^{-\delta(t+\theta-s)}\|f(s,X_s)\|\,ds\right]^2$$
$$\le 3M^2\sup_{-r\le\theta\le0}\left\{\left[\int_0^{t+\theta}e^{-\delta(t+\theta-s)}\,ds\right]\left[\int_0^{t+\theta}e^{-\delta(t+\theta-s)}E\|f(s,X_s)\|^2\,ds\right]\right\}$$
$$\le \frac{3M^2}{\delta}\sup_{-r\le\theta\le0}\int_0^{t+\theta}e^{-\delta(t+\theta-s)}E\|f(s,X_s)\|^2\,ds \le \frac{3M^2 K_1^2}{\delta}\int_0^t e^{-\delta(t-s)}E\|X_s\|_{C_r}^2\,ds.$$
As to $I_3'$, in a similar manner (with the additional help of the Itô isometry), we have
$$I_3' = 3\sup_{-r\le\theta\le0}\left\{\int_0^{t+\theta}\|T(t+\theta-s)\|^2 E\|g(s,X_s)\|_{L_2^0}^2\,ds\right\}$$
$$\le 3M^2\sup_{-r\le\theta\le0}\left\{\int_0^{t+\theta}e^{-2\delta(t+\theta-s)}E\|g(s,X_s)\|_{L_2^0}^2\,ds\right\} \le 3K_2^2 M^2\int_0^t e^{-\delta(t-s)}E\|X_s\|_{C_r}^2\,ds.$$
Combining, we conclude that
$$E\|X_t\|_{C_r}^2 \le 3M^2\|\varphi(0)\|^2\,e^{-\delta(t-r)} + 3M^2\left(\frac{K_1^2}{\delta} + K_2^2\right)\int_0^t e^{-\delta(t-s)}E\|X_s\|_{C_r}^2\,ds. \quad (3.43)$$
Now, taking $\gamma$ arbitrary with $0 < \gamma < \delta$ and $T > 0$ large enough, we obtain
$$\int_0^T e^{\gamma t}E\|X_t\|_{C_r}^2\,dt \le 3M^2\|\varphi(0)\|^2 e^{\delta r}\int_0^T e^{-(\delta-\gamma)t}\,dt$$
$$\qquad + 3M^2\left(\frac{K_1^2}{\delta} + K_2^2\right)\int_0^T e^{-(\delta-\gamma)t}\int_0^t e^{\delta s}E\|X_s\|_{C_r}^2\,ds\,dt. \quad (3.44)$$
On the other hand,
$$\int_0^T e^{-(\delta-\gamma)t}\int_0^t e^{\delta s}E\|X_s\|_{C_r}^2\,ds\,dt = \int_0^T\int_s^T e^{\delta s}E\|X_s\|_{C_r}^2\,e^{-(\delta-\gamma)t}\,dt\,ds$$
$$= \int_0^T e^{\delta s}E\|X_s\|_{C_r}^2\left[\int_s^T e^{-(\delta-\gamma)t}\,dt\right]ds \le \frac{1}{\delta-\gamma}\int_0^T e^{\gamma s}E\|X_s\|_{C_r}^2\,ds. \quad (3.45)$$
Substituting (3.45) into (3.44) gives
$$\int_0^T e^{\gamma t}E\|X_t\|_{C_r}^2\,dt \le 3M^2\|\varphi(0)\|^2 e^{\delta r}\int_0^T e^{-(\delta-\gamma)t}\,dt + \frac{K_4}{\delta-\gamma}\int_0^T e^{\gamma s}E\|X_s\|_{C_r}^2\,ds, \quad (3.46)$$
where
$$K_4 = 3M^2\left(\frac{K_1^2}{\delta} + K_2^2\right).$$
Since $K_4$ can be made small enough by assumption, it is possible to choose a suitable $\gamma$ with $0 < \gamma < \delta - K_4$ such that
$$1 - \frac{K_4}{\delta - \gamma} > 0.$$
Hence, letting $T \to \infty$ in (3.46) yields
$$\int_0^\infty e^{\gamma t}E\|X_t\|_{C_r}^2\,dt \le \frac{1}{1 - \frac{K_4}{\delta-\gamma}}\left[3M^2\|\varphi(0)\|^2 e^{\delta r}\int_0^\infty e^{-(\delta-\gamma)t}\,dt\right] \le K(\gamma,\delta)\|\varphi\|_{C_r}^2.$$
Therefore, we can deduce from (3.43) that
$$E\|X_t\|_{C_r}^2 \le 3M^2\|\varphi(0)\|^2\,e^{-\delta(t-r)} + K_4\,e^{-\gamma t}\int_0^t e^{\gamma s}E\|X_s\|_{C_r}^2\,ds$$
$$\le \left[3M^2\|\varphi(0)\|^2 e^{\delta r} + K_4\,K(\gamma,\delta)\|\varphi\|_{C_r}^2\right]e^{-\gamma t} \le K'(\gamma,\delta)\|\varphi\|_{C_r}^2\,e^{-\gamma t},$$
as desired.
3.6 Bibliographical Notes
In this chapter, we began by recalling some elementary definitions from probability theory which can be found in any good textbook on probability. The material presented here on sequences of events, random variables, convergence of random variables, and conditional expectation was mostly taken from Billingsley [29], Casella and Berger [34], Grigoriu [82], Mikosch [142], and Pfeiffer [154]. This enabled us to introduce the theory of stochastic processes. The latter is based on nonelementary facts from measure theory and classical functional analysis. The concept of martingales was also discussed; the martingales constitute an important class of stochastic processes. The subsections on continuity, separability, measurability, stopping times, Gaussian processes, and martingales were taken from Bakstein and Capasso [15], Bauer [17], Grigoriu [82], Kallianpur [107], Métivier [140], and Mikosch [142]. The Itô integral was subsequently introduced. Its definition goes back to Itô (1942–1944), who introduced the stochastic integral with a random integrand. In 1953, Doob made the connection between Itô integration and martingale theory. Itô integration plays a key role in constructing solutions of stochastic differential equations. The presentation on Itô integration given here follows closely that of Øksendal [150], Mao and Yuan [135], and Da Prato and Zabczyk [47]. In addition, stochastic convolution integrals were introduced. They play an essential role in the construction of stochastic partial differential equations involving sectorial operators. The material used in our presentation on stochastic convolution integrals was taken from Seidler [163] and Seidler and Sobukawa [164]. The concept of the Itô integral led us to study stochastic differential equations in a separable Hilbert space; the stochastic calculus discussed in this chapter remains valid in this space. The investigation of stochastic differential equations has attracted considerable attention from researchers. Recently, many authors have studied existence and uniqueness, boundedness, stability, and other quantitative and qualitative properties of solutions to stochastic differential equations. One of the techniques used to discuss these topics is the semigroup approach. Many important results have been reported; see for instance Appleby [9], Caraballo and Kai Liu [33], Fu [77], Ichikawa [104], Keck and McKibben [110], Luo [133, 134], and Kai Liu [106]. The material used in our presentation on stochastic differential equations was collected from those sources.