VA 23187-8795, USA
Abstract
We present an algorithm for computing the probability density function of the product of two
independent random variables, along with an implementation of the algorithm in a computer
algebra system. We combine this algorithm with the earlier work on transformations of random
variables to create an automated algorithm for convolutions of random variables. Some examples
demonstrate the algorithm’s application.
© 2002 Elsevier B.V. All rights reserved.
1. Introduction
Rohatgi’s well-known result (1976, p. 141) for determining the distribution of the
product of two random variables is straightforward to derive, but difficult to implement.
Let X and Y be continuous random variables with joint PDF $f_{X,Y}(x, y)$. The PDF
of V = XY is
\[
f_V(v) = \int_{-\infty}^{\infty} f_{X,Y}\!\left(x, \frac{v}{x}\right) \frac{1}{|x|}\, dx.
\]
∗ Corresponding author.
E-mail addresses: aa1275@usma.edu (A.G. Glen), leemis@math.wm.edu (L.M. Leemis),
jhdrew@math.wm.edu (J.H. Drew).
0167-9473/03/$ - see front matter © 2002 Elsevier B.V. All rights reserved.
PII: S0167-9473(02)00234-7
ARTICLE IN PRESS
2 A.G. Glen et al. / Computational Statistics & Data Analysis ( ) –
The implementation of this result, however, is not straightforward for general X and
Y. Springer (1979) presents a chapter on finding distributions of products of random
variables, relying mostly on Laplace and Mellin transformation techniques, as
implementation of the earlier result is often too cumbersome. Difficulties occur as a
result of both the myriad variations in the limits of integration and the propensity
of the PDF of V to be defined in a piece-wise manner. In this paper, we consider
the cases when X and Y are independent and may have probability density functions
(PDFs) defined in a piece-wise fashion. This simplification implies that the joint
PDF of X and Y will be defined on a rectangular product space, which can be easily
broken into special cases based on geometry. This special case of Rohatgi’s result
is implemented in a computer algebra system by enumerating all possible sets of
limits of integration. We extend the use of this algorithm for products of independent
random variables to create an algorithm for convolutions of independent random
variables. These algorithms and their implementations are important procedures in
A Probability Programming Language (APPL, Glen et al., 2001), a Maple-based software
package (Maple, 2001), whose purpose is to automate operations on random
variables.
2. Theorem
Considering a special case of Rohatgi’s result will illustrate some of the issues
associated with a general algorithm for determining the PDF of the product of two
independent random variables. For completeness, the theorem is proven from first
principles (using the transformation technique) even though it could be stated that
it is a special case of Rohatgi’s result. Assume that the random variable X has
support on the interval (a, b) and the random variable Y has support on the
interval (c, d). Also, the product space of the two random variables is assumed to
fall entirely in the first quadrant. Theorems and proofs for other rectangular
support regions are similar. The algorithm described in Section 3 includes all possible
scenarios.
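Theorem. Let X be a random variable of the continuous type with PDF f(x) which
is defined and positive on the interval (a, b), where 0 < a < b < ∞. Similarly, let Y
be a random variable of the continuous type with PDF g(y) which is defined and
positive on the interval (c, d), where 0 < c < d < ∞. The PDF of V = XY is
\[
h(v) =
\begin{cases}
\displaystyle \int_{a}^{v/c} g\!\left(\frac{v}{x}\right) f(x)\, \frac{1}{x}\, dx, & ac < v < ad, \\[2ex]
\displaystyle \int_{v/d}^{v/c} g\!\left(\frac{v}{x}\right) f(x)\, \frac{1}{x}\, dx, & ad < v < bc, \\[2ex]
\displaystyle \int_{v/d}^{b} g\!\left(\frac{v}{x}\right) f(x)\, \frac{1}{x}\, dx, & bc < v < bd,
\end{cases}
\]
when ad < bc,
\[
h(v) =
\begin{cases}
\displaystyle \int_{a}^{v/c} g\!\left(\frac{v}{x}\right) f(x)\, \frac{1}{x}\, dx, & ac < v < ad, \\[2ex]
\displaystyle \int_{v/d}^{b} g\!\left(\frac{v}{x}\right) f(x)\, \frac{1}{x}\, dx, & ad < v < bd,
\end{cases}
\]
when ad = bc, and
\[
h(v) =
\begin{cases}
\displaystyle \int_{a}^{v/c} g\!\left(\frac{v}{x}\right) f(x)\, \frac{1}{x}\, dx, & ac < v < bc, \\[2ex]
\displaystyle \int_{a}^{b} g\!\left(\frac{v}{x}\right) f(x)\, \frac{1}{x}\, dx, & bc < v < ad, \\[2ex]
\displaystyle \int_{v/d}^{b} g\!\left(\frac{v}{x}\right) f(x)\, \frac{1}{x}\, dx, & ad < v < bd,
\end{cases}
\]
when ad > bc.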
Proof. Only the case of ad < bc is considered. The other cases are proven analogously.
Using the transformation technique (Hogg and Craig, 1995, p. 173), the transformation
Z = X and V = XY constitutes a 1–1 mapping from A = {(x, y) | a < x < b, c < y < d} to
B = {(z, v) | a < z < b, cz < v < dz}. Let u denote the transformation and w the inverse
transformation. The transformation, inverse, and Jacobian are:
\[
\begin{aligned}
z &= u_1(x, y) = x, & x &= w_1(z, v) = z, \\
v &= u_2(x, y) = xy, & y &= w_2(z, v) = v/z,
\end{aligned}
\]
\[
J = \begin{vmatrix} 1 & 0 \\ -v/z^2 & 1/z \end{vmatrix} = 1/z.
\]
The joint PDF of Z and V is
\[
f_{Z,V}(z, v) = f(w_1(z, v))\, g(w_2(z, v))\, |J|, \qquad (z, v) \in B,
\]
or
\[
f_{Z,V}(z, v) = f(z)\, g(v/z)\, \frac{1}{|z|}, \qquad (z, v) \in B.
\]
Integrating with respect to z over the appropriate intervals and replacing z with x in
the final result yields the marginal PDF of V
\[
h(v) =
\begin{cases}
\displaystyle \int_{a}^{v/c} g\!\left(\frac{v}{x}\right) f(x)\, \frac{1}{x}\, dx, & ac < v < ad, \\[2ex]
\displaystyle \int_{v/d}^{v/c} g\!\left(\frac{v}{x}\right) f(x)\, \frac{1}{x}\, dx, & ad < v < bc, \\[2ex]
\displaystyle \int_{v/d}^{b} g\!\left(\frac{v}{x}\right) f(x)\, \frac{1}{x}\, dx, & bc < v < bd,
\end{cases}
\]
as desired.
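[Fig. 1: the rectangle a < x < b, c < y < d in the (x, y) plane, divided into regions A1, A2 and A3 by the hyperbolas xy = ac, xy = ad, xy = bc and xy = bd.]
[Fig. 2: the image region in the (z, v) plane for a < z < b, divided into regions B1, B2 and B3 at the levels v = ac, v = ad, v = bc and v = bd.]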
The geometry associated with the transformation for the case ad < bc considered in
the proof is shown in Figs. 1 and 2. The transformation maps A_i onto B_i for i = 1, 2, 3.
Although the transformation technique has been used to prove this theorem, the CDF
technique could also have been used. The theorem also follows directly from Rohatgi’s
result since $f_{X,Y}(x, v/x) = f(x)\, g(v/x)$, which is non-zero only when a < x < b
and c < v/x < d. Thus max{a, v/d} < x < min{b, v/c}, which yields the limits of
integration displayed in the theorem. The geometry of the case ad < bc is shown in Fig. 1,
in which the regions A_1, A_2 and A_3 correspond to the three different integrals for that
case. Each A_i corresponds to a different interval of v for the family of hyperbolas given
by xy = v. Given v, the interval of integration consists of those values of x for which
the curve xy = v intersects the rectangle {(x, y) : a < x < b, c < y < d}.
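The limits max{a, v/d} < x < min{b, v/c} lend themselves to a quick numerical check. The Python sketch below is illustrative only and is not part of APPL: the names `x_limits`, `product_pdf` and `uniform_density` are hypothetical, the rectangle is assumed to lie in the first quadrant, and the integral is approximated by a midpoint rule rather than evaluated symbolically as Maple would.

```python
import math


def x_limits(a, b, c, d, v):
    """Return (lo, hi), the x-interval on which the hyperbola xy = v crosses
    the rectangle (a, b) x (c, d). Assumes 0 < a < b and 0 < c < d."""
    lo = max(a, v / d)
    hi = min(b, v / c)
    return lo, hi


def product_pdf(f, g, a, b, c, d, v, steps=20000):
    """Midpoint-rule approximation of h(v) = integral of g(v/x) f(x) (1/x) dx."""
    lo, hi = x_limits(a, b, c, d, v)
    if hi <= lo:
        return 0.0  # v lies outside the support (ac, bd)
    width = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * width
        total += g(v / x) * f(x) / x
    return total * width


def uniform_density(t):
    """Density of a uniform random variable on an interval of length 1."""
    return 1.0


# X ~ U(1, 2) and Y ~ U(3, 4), so a = 1, b = 2, c = 3, d = 4 and ad < bc.
for v in (3.5, 5.0, 7.0):
    print(v, x_limits(1, 2, 3, 4, v),
          product_pdf(uniform_density, uniform_density, 1, 2, 3, 4, v))
```

For these two uniform variables the three printed values should agree, up to discretization error, with ln(v/3), ln(4/3) and ln(8/v), which are the three integrals of the theorem evaluated with f = g = 1.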
3. Algorithm development
The theorem in the previous section illustrates the importance of considering the
magnitudes of the product of the coordinates of the southeast and northwest corners
of the product space [i.e., (b, c) and (a, d)] when it lies entirely in the first quadrant.
In order to apply the theorem to any continuous random variables X and Y, three
generalizations need to be addressed.
(1) Analogous theorems must be written for the cases when the (a, b) by (c, d)
rectangle lies wholly in one of the other three quadrants. Rectangles that lie in more
than one quadrant can be split into parts.
(2) Instead of having PDFs that are specified by a single standard function over their
entire support, the random variables X and Y may be defined in a piece-wise manner
over several intervals, forming many segments to the PDF (e.g., the triangular
distribution).
(3) The cases when 0 and/or ±∞ belong to the endpoints of the intervals which
constitute the support of X and Y must be considered (e.g., the normal distribution,
see Examples 4.3 and 4.4).
These generalizations result in 24 (6 for each quadrant) primary cases that must be
considered in order to correctly compute the limits of integration of the theorem.
Further, some of these 24 cases must be broken into as many as 3 subcases, as illustrated
in the theorem. In the first quadrant (i.e., a > 0 and c > 0), for example, we must
consider the primary cases:
The paragraphs below provide more details. For quadrants II, III and IV, the limits of
integration must also be set based on the appropriate geometry, as in Fig. 1.
For random variables that are defined in a piece-wise fashion over various intervals,
let n be the number of intervals for X and let m be the number of intervals for Y. There
are mn rectangular “product spaces” and the contribution of each to the value of the
PDF of V = XY must be computed. Furthermore, each “product space” can contribute
differently to the PDF of V on up to three segments of the support of V (see the
theorem). As a result, the PDF of V tends to become complicated very quickly, with
an upper limit of 3mn segments to its PDF. Furthermore, each of these segments fits
into one of the 24 cases of the integration mentioned above. For example, the product
of two U(0, 1) random variables yields a random variable V with only one segment.
But only a slight change, e.g., X ~ U(1, 2) and Y ~ U(3, 4), yields a V = XY defined
differently on three segments (see Example 4.1).
The case where the support of a random variable contains 0 (e.g., U(−1, 2)) poses
special difficulty since some of the rectangular product spaces will not lie wholly
in any one quadrant and cannot be handled by the previously developed techniques.
Our solution to this difficulty is to artificially add 0 as one of the endpoints of the
intervals for X and Y whenever this case occurs, producing redundant segments, i.e.,
two segments on either side of zero with the same formula for the PDF. This method
allows for implementation when the support area crosses into more than one quadrant.
Example 4.3 gives an application of this special case, where the area of interest lies in
all four quadrants. The algorithm subdivides the total area into four components, each
entirely in one quadrant, and calculates the relevant portion of the new PDF for each
quadrant.
The algorithm consists of a set-up portion, followed by nested loops that determine
the contribution to the PDF of V = XY separately for each of the four quadrants.
The appendix contains the set-up portion and the algorithm associated with the first
quadrant. The algorithm for the other quadrants is similar.
The set-up phase begins by setting n and m to the number of intervals that form the
support of X and Y. Next, 0 is added as an interval delimiter for X and/or Y if the
random variable can assume both positive and negative values, and 0 is not already
an interval delimiter. Finally, the endpoints of the intervals that form the support of
V are determined by taking all products of the endpoints of the X intervals times the
endpoints of the Y intervals.
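The set-up phase can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the APPL implementation: the helper names `add_zero` and `product_endpoints` are hypothetical, the endpoints are assumed to be finite numbers given in increasing order (infinite endpoints would need separate handling, since 0 · ∞ is undefined in floating point), and duplicate products are merged.

```python
def add_zero(endpoints):
    """Insert 0 as a delimiter when the support straddles 0 and 0 is absent."""
    if endpoints[0] < 0 < endpoints[-1] and 0 not in endpoints:
        return sorted(list(endpoints) + [0])
    return list(endpoints)


def product_endpoints(x_ends, y_ends):
    """Sorted distinct products of every X endpoint with every Y endpoint."""
    xs = add_zero(x_ends)
    ys = add_zero(y_ends)
    return sorted({x * y for x in xs for y in ys})


print(add_zero([-1, 2]))
print(product_endpoints([1, 2], [3, 4]))
print(product_endpoints([1, 2, 3], [1, 2, 4]))
```

With endpoints [1, 2] and [3, 4] the candidate breakpoints are 3, 4, 6 and 8, which matches the three segments reported for U(1, 2) times U(3, 4) in Example 4.1.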
A nested set of loops follows that treats all pairings of X and Y intervals. As shown
in Fig. 1, the coordinates (a, c) are assigned to the southwest corner of the current
rectangle of interest, and the coordinates (b, d) are assigned to the northeast corner
of the current rectangle of interest. A test to determine which quadrant contains the
current rectangle is made at this point. Adding 0 as an interval delimiter in the set-up
phase assures that the current rectangle will be completely contained in just one of the
quadrants. Once the quadrant is determined, tests on c and d determine which integrals
should be computed and the appropriate limits of integration. Finally, the insertion of
0 sometimes leads to a PDF for V with the same formula on both sides of 0. If this
occurs, the program simplifies the PDF by removing 0 as an interval endpoint if the
function is defined at 0.
4. Examples
This section contains applications of using the APPL procedure Product, which is
our implementation of the algorithm described in the previous section.
Example 4.1. Consider the random variable X ~ U(1, 2) and the random variable
Y ~ U(3, 4). Find the distribution of V = XY.
This is a simple application of the algorithm. The following Maple code defines the
random variables X and Y and returns the PDF of their product. Note, the procedure
UniformRV returns the PDF in a Maple list-of-lists data structure outlined in Glen
et al. (2001):
X:= UniformRV(1, 2);
Y:= UniformRV(3, 4);
V:= Product(X, Y);
PDF(V);
These statements yield the following PDF for V = XY:
\[
h(v) =
\begin{cases}
\ln v - \ln 3, & 3 < v < 4, \\
\ln 4 - \ln 3, & 4 < v < 6, \\
3 \ln 2 - \ln v, & 6 < v < 8.
\end{cases}
\]
Note that while the PDFs of both X and Y are defined on single segments that have
positive interval limits, the PDF of V is defined on three segments.
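As a check, the three segments can be reproduced by hand from the theorem of Section 2. Here a = 1, b = 2, c = 3, d = 4, so ad = 4 < bc = 6, and f(x) = g(y) = 1 on the respective supports:

```latex
\begin{aligned}
h(v) &= \int_{1}^{v/3} \frac{1}{x}\, dx = \ln\frac{v}{3} = \ln v - \ln 3, & 3 < v < 4, \\
h(v) &= \int_{v/4}^{v/3} \frac{1}{x}\, dx = \ln\frac{4}{3} = \ln 4 - \ln 3, & 4 < v < 6, \\
h(v) &= \int_{v/4}^{2} \frac{1}{x}\, dx = \ln\frac{8}{v} = 3 \ln 2 - \ln v, & 6 < v < 8.
\end{aligned}
```

The three pieces integrate to 1 over (3, 8), as a PDF must.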
Example 4.2. Consider the random variable X ~ Triangular(1, 2, 3) and the random
variable Y ~ Triangular(1, 2, 4). Find the PDF of V = XY. Since X and Y are each defined
on two segments, this non-uniform example illustrates the case of several (four)
rectangular product spaces. The APPL code in this case is:
X := TriangularRV(1, 2, 3);
Y := TriangularRV(1, 2, 4);
V := Product(X, Y);
PDF(V);
The resulting PDF for V has six segments:
\[
h(v) =
\begin{cases}
-4v/3 + (2/3) \ln v + (2v/3) \ln v + 4/3, & 1 < v < 2, \\[1ex]
-8 + (14/3) \ln 2 + (7v/3) \ln 2 + 10v/3 - 4 \ln v - (5v/3) \ln v, & 2 < v < 3, \\[1ex]
-4 + (14/3) \ln 2 + (7v/3) \ln 2 + 2v - 2 \ln v - v \ln v - 2 \ln 3 - (2v/3) \ln 3, & 3 < v < 4, \\[1ex]
44/3 - 14 \ln 2 - (7v/3) \ln 2 - 8v/3 - 2 \ln 3 + (22/3) \ln v - (2v/3) \ln 3 + (4v/3) \ln v, & 4 < v < 6, \\[1ex]
8/3 - 8 \ln 2 - (4v/3) \ln 2 - 2v/3 + (4/3) \ln v + (v/3) \ln v + 4 \ln 3 + (v/3) \ln 3, & 6 < v < 8, \\[1ex]
-8 + 8 \ln 2 + (2v/3) \ln 2 + 2v/3 + 4 \ln 3 - 4 \ln v + (v/3) \ln 3 - (v/3) \ln v, & 8 < v < 12.
\end{cases}
\]
Example 4.3. Consider the random variable X ~ N(0, 1) and the random variable
Y ~ N(0, 1). Find the PDF of V = XY. This example illustrates the case of 0 in the
support of X and Y and also the case where the support of X and Y includes the
endpoints ±∞.
The program yields the following PDF for V:
\[
h(v) =
\begin{cases}
\dfrac{\mathrm{BesselK}(0,\, v \cdot \mathrm{signum}(v))}{\pi}, & -\infty < v < 0, \\[2ex]
\dfrac{\mathrm{BesselK}(0,\, v \cdot \mathrm{signum}(v))}{\pi}, & 0 < v < \infty,
\end{cases}
\]
which relies on Maple’s BesselK and signum functions. An APPL plot of this function
is given in Fig. 3.
[Fig. 3: plot of the PDF of V = XY; axes labeled x and f(x), with x from −3 to 3 and f(x) from 0 to 0.8.]
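Since v · signum(v) = |v|, this result is the standard one that the product of two independent standard normal variables has PDF K_0(|v|)/π, where K_0 is the modified Bessel function of the second kind. The Python sketch below compares that expression with a direct numerical evaluation of Rohatgi’s integral. It is an illustration only: the function names are hypothetical, K_0 is computed from its integral representation K_0(x) = ∫_0^∞ exp(−x cosh t) dt instead of a library call, and both integrals use a truncated midpoint rule.

```python
import math


def bessel_k0(x, upper=12.0, steps=24000):
    """K_0(x) for x > 0 from the integral of exp(-x cosh t) over t in (0, upper)."""
    width = upper / steps
    return width * sum(math.exp(-x * math.cosh((k + 0.5) * width))
                       for k in range(steps))


def normal_pdf(t):
    """Standard normal density."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def product_pdf_normal(v, lo=-12.0, hi=12.0, steps=48000):
    """Rohatgi's integral for X, Y ~ N(0, 1) and v != 0.

    The substitution x = exp(s) turns dx/|x| into ds. Negative x contributes
    the same amount as positive x by symmetry, hence the factor 2."""
    width = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = math.exp(lo + (k + 0.5) * width)
        total += normal_pdf(x) * normal_pdf(v / x)
    return 2.0 * total * width


for v in (-2.0, 0.5, 1.0):
    print(v, product_pdf_normal(v), bessel_k0(abs(v)) / math.pi)
```

The two printed columns should agree closely at every nonzero v. At v = 0 the density has a logarithmic singularity, so the sketch should not be evaluated there.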
Example 4.4. Consider the independent random variables U_1 ~ U(0, 1) and U_2 ~ U(0, 1).
The Box–Muller algorithm for generating a single standard normal deviate V can be
coded in one line (Devroye, 1996) as
\[
V \leftarrow \sqrt{-2 \ln U_1}\, \cos(2 \pi U_2),
\]
where U_1 and U_2 are independent random numbers. Using the Transform (Glen et al.,
1997) and Product procedures together, one can determine the PDF of V. Due to the
principal inverse difficulty with trigonometric functions, however, the transformation
must be rewritten as
\[
V \leftarrow \sqrt{-2 \ln U_1}\, \cos(\pi U_2)
\]
before using Transform. The APPL code to compute the distribution of V is:
U1 := UniformRV(0, 1);
U2 := UniformRV(0, Pi);
T1 := Transform(U1, [[x -> sqrt(-2 * ln(x))], [0, 1]]);
T2 := Transform(U2, [[x -> cos(x)], [0, Pi]]);
V := Product(T1, T2);
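This APPL code yields the following PDF for V:
\[
h(v) =
\begin{cases}
\displaystyle \int_{-1}^{0} \frac{-v\, e^{-v^2/(2x^2)}}{\pi \sqrt{1 - x^2}\; x^2}\, dx, & -\infty < v < 0, \\[3ex]
\displaystyle \int_{0}^{1} \frac{v\, e^{-v^2/(2x^2)}}{\pi \sqrt{1 - x^2}\; x^2}\, dx, & 0 < v < \infty.
\end{cases}
\]
While this form is not easily recognizable as the PDF for the normal distribution, it
is mathematically equivalent to the more standard
\[
h(v) = \frac{1}{\sqrt{2\pi}}\, e^{-v^2/2}, \qquad -\infty < v < \infty.
\]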
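The equivalence can be verified by hand. The derivation below is a sketch for v > 0, where the integral over 0 < x < 1 is taken with integrand v e^{−v²/(2x²)} / (π √(1 − x²) x²). The substitution x = (1 + w²)^{−1/2} is our choice, not one stated in the source. It gives dx / (√(1 − x²) x²) = −dw and maps x ∈ (0, 1) to w ∈ (∞, 0):

```latex
\begin{aligned}
\int_{0}^{1} \frac{v\, e^{-v^2/(2x^2)}}{\pi \sqrt{1 - x^2}\; x^2}\, dx
  &= \frac{v}{\pi} \int_{0}^{\infty} e^{-v^2 (1 + w^2)/2}\, dw \\
  &= \frac{v}{\pi}\, e^{-v^2/2} \cdot \frac{\sqrt{2\pi}}{2v} \\
  &= \frac{1}{\sqrt{2\pi}}\, e^{-v^2/2}.
\end{aligned}
```

The case v < 0 follows by the symmetry x → −x, v → −v.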
As a final example, the Product procedure can be used in various types of statistical
inference, as illustrated here in hypothesis testing.
Example 4.5 (Hogg and Craig, 1995, p. 287). Let X_1 and X_2 be iid observations drawn
from a population with PDF
\[
f(x) = \theta x^{\theta - 1}, \qquad 0 < x < 1,
\]
where θ > 0. Test H_0: θ = 1 versus H_1: θ > 1 using the test statistic X_1 X_2 and
the critical region C = {(X_1, X_2) | X_1 X_2 > 3/4}. Find the significance level and power
function for the test.
The APPL code to compute the power function is:
n := 2;
crit := 3 / 4;
assume(theta > 0);
X := [[x -> theta * x ^ (theta - 1)], [0, 1], ["Continuous", "PDF"]];
T := ProductIID(X, n);
power := SF(T, crit);
which yields
\[
\Pr(\text{rejecting } H_0 \mid \theta) = 1 - (3/4)^{\theta} + \theta\, (3/4)^{\theta} \ln(3/4).
\]
The fact that the population distribution is non-standard indicates that X must be
defined using the list-of-three-lists data structure shown above. The assume statement
defines the parameter space. The ProductIID procedure simply makes repeated calls
to Product. Finally, the SF procedure returns the survivor function, which is the
complement of the CDF. To compute the significance level of the test, the additional Maple
statement
subs(theta = 1, power);
is required.
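The power function can be cross-checked outside Maple. The Python sketch below is an illustration with hypothetical function names. It evaluates the closed form 1 − (3/4)^θ + θ(3/4)^θ ln(3/4) and compares it with a midpoint-rule integral of the product PDF h(v) = −θ² v^{θ−1} ln v on (0, 1). That PDF is our own application of the product formula to two iid copies of X, not an expression quoted from the paper.

```python
import math


def power(theta, crit=0.75):
    """Closed-form Pr(X1 * X2 > crit) for iid X with PDF theta * x**(theta - 1)."""
    return 1.0 - crit ** theta + theta * crit ** theta * math.log(crit)


def power_numeric(theta, crit=0.75, steps=20000):
    """Integrate the product PDF -theta^2 v^(theta - 1) ln(v) over (crit, 1)."""
    width = (1.0 - crit) / steps
    total = 0.0
    for k in range(steps):
        v = crit + (k + 0.5) * width
        total += -theta ** 2 * v ** (theta - 1.0) * math.log(v)
    return total * width


print(power(1.0))                      # significance level (theta = 1)
print(power(2.5), power_numeric(2.5))  # closed form vs. numerical integral
```

Substituting θ = 1 into the closed form gives 1/4 + (3/4) ln(3/4), approximately 0.0342, as the significance level of the test.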
5. Extensions
Occasionally, Maple returns a PDF in a less than optimal form. The expression
ln(−v) − Iπ, for example, may occur as part of a PDF. Since ln(−1) = Iπ, the expression
is equal to ln v. Also, the PDF may involve a complicated integral, which Maple fails to
recognize as equal to a much simpler standard form (see Example 4.4). We anticipate
that future generations of computer algebra systems will overcome these difficulties.
The algorithms that produce the product and convolutions of random variables greatly
increase the “feasible” realm of random variable algebra. Summing 50 independent
uniform random variables is straightforward in theory, but the implementation is not
possible by hand. There are many cases, though, where the integration required by
products and convolutions is intractable and thus the algorithms appear to fail. In
fact, it is the underlying inability of the program (Maple) to implement the algorithm
in APPL that produces the failure. In general, there are three classes of resultant
distributions that come from products and convolutions via this algorithm. Class A
is the set of resultant distributions that are completely tractable, such as the sum of
five uniform random variables. Here the PDF is in closed form, the CDF is calculable,
quantiles can be calculated, and plots of distributions are easily produced. The second
set, Class B, is the set of resultant distributions that are not in closed form, but are
in forms that Maple can still evaluate, such as PDFs that rely on the erf function or
the BesselK functions of Maple. Example 4.3 is one such case, where the non-closed
form PDF is in terms of the BesselK function, which is used in solving systems of
differential equations. In Class B, it is often possible to plot the PDF and calculate
quantiles. Finding a usable form for the CDF, however, might not be possible. The
final class of problems, Class C, is the set that only appears as un-evaluated integrals.
The convolution of three Weibull random variables is one such example. Here the
resulting PDF has little use, as Maple cannot plot the function, and cannot evaluate
critical points very effectively.
In conclusion, an algorithm for calculating the PDF of the product of two independent
random variables X and Y (which may be defined in a piece-wise manner)
has been developed and implemented. The algorithm has also been extended to calculate
the PDF of convolutions of independent random variables. The APPL procedure
Product is one of many procedures capable of automating complicated probability
calculations associated with random variables (Glen et al., 2001). Potential applications
for calculations of this type occur in all areas of applied probability and statistics.
Acknowledgements
The authors gratefully acknowledge the assistance of Professor Hank Krieger in the
early work associated with this paper and Professor Donald Barr for his assistance
throughout the project. The authors acknowledge the helpful comments from the
reviewers and Professor Diane Evans. The second author acknowledges support from
The College of William and Mary’s Faculty Research Assignment program.
Appendix
This appendix gives the details associated with the algorithm that computes the
distribution of the product of two continuous, independent random variables. The algorithm
has been implemented in Maple and is one of the procedures in APPL. Cardinality is
denoted by ‖·‖ and the empty list is denoted by [ ]. Indentation is used to indicate
loops and conditions. The conditions prior to the calculation of the integrals are to
avoid calculating integrals unnecessarily.
Procedure product
Input: The distribution of the random variable X and the distribution of the random
variable Y, where X and Y are independent, univariate, continuous random variables.
The PDFs are in the list-of-lists format, i.e., the random variable X is represented by the
data structure [$\tilde{f}(x)$, $x^\star$, ["Continuous", "PDF"]], and the random variable Y is
represented by the data structure [$\tilde{g}(y)$, $y^\star$, ["Continuous", "PDF"]], as described
in Glen et al. (2001). The first sublist for X contains a list of the piece-wise f(x)
functions, that is, $\tilde{f}(x) = [f_1(x), f_2(x), \ldots, f_n(x)]$. The second sublist for X contains
a list of the endpoints of the support of X, that is, $x^\star = [x^\star_1, x^\star_2, \ldots, x^\star_{n+1}]$, where
$f(x) = f_i(x)$ for $x^\star_i < x < x^\star_{i+1}$, $i = 1, 2, \ldots, n$.
Output: The PDF of V = XY, h(v), in the same list-of-lists format as the input.
The list $\tilde{h}(v) = [h_1(v), h_2(v), \ldots, h_l(v)]$ contains the piece-wise elements of the PDF
of V and $v^\star = [v^\star_1, v^\star_2, \ldots, v^\star_{l+1}]$ contains the endpoints of the support of V. The
PDF is determined by a special case of Rohatgi’s result for two independent random
variables
\[
h(v) = \int_{-\infty}^{\infty} f(x)\, g\!\left(\frac{v}{x}\right) \frac{1}{|x|}\, dx.
\]
Algorithm
n ← ‖$x^\star$‖ − 1        n is the number of segments of the PDF of X
m ← ‖$y^\star$‖ − 1        m is the number of segments of the PDF of Y
if ($x^\star_1$ < 0 and $x^\star_{n+1}$ > 0 and 0 ∉ $x^\star$) then        insert 0 into $x^\star$ if necessary
    for i ← 1 to n
        if ($x^\star_i$ < 0 and $x^\star_{i+1}$ > 0) then
            insert 0 between positions $x^\star_i$ and $x^\star_{i+1}$
c ← $y^\star_j$
d ← $y^\star_{j+1}$
References
Devroye, L., 1996. Random variate generation in one line of code. In: Charnes, J., Morrice, D., Brunner,
D., Swain, J. (Eds.), Proceedings of the 1996 Winter Simulation Conference, Coronado, CA. Institute of
Electrical and Electronics Engineers, Piscataway, NJ,
pp. 265–272.
Glen, A., Leemis, L., Drew, J., 1997. A generalized univariate change-of-variable transformation technique.
INFORMS J. Comput. 9 (3), 288–295.
Glen, A., Evans, D., Leemis, L., 2001. APPL: a probability programming language. Amer. Statist. 55 (2),
156–166.
Hogg, R.V., Craig, A.T., 1995. Mathematical Statistics, 5th Edition. Prentice-Hall, Englewood Cliffs, NJ.
Maple Version 7, 2001. Waterloo Maple, Inc. Waterloo, Canada.
Rohatgi, V.K., 1976. An Introduction to Probability Theory and Mathematical Statistics. Wiley, New York.
Springer, M.D., 1979. The Algebra of Random Variables. Wiley, New York.