
Haihao Liu

12W

IB Mathematics Higher Level

Candidate No.: 2631

Investigating the origins and applications of Euler's number e in probability theory and statistical distributions

Introduction to Euler's number and its applications


Euler's number, denoted e, is in many ways a unique and fascinating number, with
many interesting and often unexpected properties. Like $\pi$, it is an irrational number, meaning
its decimal places don't recur, and also a transcendental number, meaning it isn't algebraic;
in other words, it is not the solution to any polynomial equation with rational coefficients.
Although it can be defined or expressed exactly in several different ways, its decimal
representation begins 2.718281828459045... It is often characterized as the base of the
natural logarithm, ln x. Two ways to express it exactly are as follows: as the limit

$$e = \lim_{n \to \infty}\left(1 + \frac{1}{n}\right)^{n}$$

and as the sum of the infinite series

$$e = \sum_{n=0}^{\infty} \frac{1}{n!} = 1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \cdots$$

The constant can be defined in many other ways; for example, e is the unique real number n such that
the derivative of the function $n^x$ (the exponentiation operation) is equal to the function itself. It
is also the unique positive real number n such that

$$\int_{1}^{n} \frac{1}{x}\,\mathrm{d}x = 1.$$
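Both expressions can be checked numerically. The short Python sketch below (my own illustration, not taken from any source) evaluates the limit for growing n and the first twenty terms of the series:

```python
import math

# Limit definition: (1 + 1/n)^n approaches e as n grows
for n in [10, 1000, 100000]:
    print(n, (1 + 1/n) ** n)

# Series definition: partial sums of 1/k! converge to e very quickly
e_series = sum(1 / math.factorial(k) for k in range(20))
print("series:", e_series)
print("math.e:", math.e)
```

The series converges far faster than the limit, which is why it is the more practical way to compute e.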

The number e is considered one of the most important and fundamental constants in
mathematics, with applications in areas including, but not limited to: calculus, compound
interest, complex analysis, exponential growth and decay, probability, and statistics. Thus,
the ubiquitous nature of e can be seen, often mysteriously appearing in unexpected
situations. I first came across various factoids demonstrating this while reading books on
popular mathematics, by authors such as Rob Eastaway and Professor Ian Stewart. I read
those books because I am naturally fascinated by this sort of mathematics, and found the topics
discussed all quite interesting. However, I distinctly remember being completely astounded
by the many properties of e. One particular example, which leads directly into the topic of my
exploration, is in Bernoulli trials, a type of experiment whose outcome is random and can be
only one of two possibilities, "success" and "failure". Suppose the chance of success on
each trial is $\frac{1}{n}$ and n trials are performed. Then, for large n (i.e. $\frac{1}{n}$ is very small), the
probability of getting no successes in all n Bernoulli trials is approximately $\frac{1}{e}$. This will be
explained and explored in more detail later. Another bizarre place in which e (and $\pi$) comes
up is in Stirling's approximation for large factorials: $n! \sim \sqrt{2\pi n}\left(\frac{n}{e}\right)^{n}$. This too will be used later
in my exploration. It is for reasons like these that I was initially intrigued and decided to look
into this number further. The more I researched, the more I was amazed. Thus I decided that
e would form the topic of my exploration. I found that e also appears in the probability
functions of various distributions. It was not at all obvious to me why a seemingly
arbitrary number defined by very abstract concepts such as limits and infinite series, and
found in calculus and complex analysis, should have any relation to probability and statistics,
which is largely grounded in and based on modeling real-life observations. I will first discuss
the binomial distribution, leading then into the Poisson distribution, which contains the
constant e in its probability mass function. I will go through the process of deriving said
function, so that we may see the origin of e in it. Hopefully, this will allow me to gain a
deeper understanding of the nature of e, and the role it plays in mathematics.


Preliminary information on relevant distributions


Introduction to probability distributions
In probability and statistics, a distribution, or more properly a probability distribution,
assigns a probability to each measurable subset of the set or range of possible
outcomes of a random experiment. "Random experiment" is a very broad, general term, and
includes everything from tossing a coin n times to measuring the kinetic energy of particles in
a gas at a given temperature. The range of possible outcomes for each experiment is
equally disparate: e.g. 140 to 190 cm for heights of students, versus 0 to 30 phone calls in
one hour. The probability distribution should be thought of as a function, where the input
is the outcome and the output is the probability, and it can be graphed visually.
Below: Various probability distributions, including both discrete and continuous.

[Figure 1: Coin Tosses (Binomial); Figure 2: Heights (Normal); Figure 3: Phone Calls (Poisson); Figure 4: Kinetic Energy (Maxwell-Boltzmann)]

As will become evident soon when explaining Bernoulli trials, processes, distributions,
etc., subtleties in terminology must be handled with care when discussing this topic, and key
distinctions ought to be explained. One such distinction is between the types of probability distributions.
Generally speaking, there are two major types of distributions, with a fundamental difference.
Those dealing with discrete random variables (e.g. the number of something, or the number of times an event
occurs) are known as discrete probability distributions, modeled by what are called probability
mass functions (PMF). On the other hand, continuous random variables (e.g. height, weight,
velocity, etc.) follow a continuous probability distribution, which is described by a probability density
function (PDF). Confusion may arise, as both PMFs and PDFs are probability distribution
functions (also PDF; note this broader sense will not be used here). Correct usage of
these terms is important, as I hope to work to rigorous mathematical standards.


Link with Bernoulli trials and binomial distribution


The situation described above of e arising from Bernoulli trials naturally links to what
are known as the Bernoulli and binomial distributions. I believe a proper understanding of
these is vital, as they are the basis from which we will derive the Poisson distribution, the PMF of
which, as mentioned, involves e. To begin, we must have a more formal mathematical
definition of a Bernoulli trial. The outcome of each Bernoulli trial (a Bernoulli variable, say, $X_i$)
is a component of a Bernoulli process. A Bernoulli process is a sequence, finite or infinite, of
discrete, independent random variables that take only one of two values, either 0 or 1. More
simply, the process can be thought of as repeated coin flipping, with each trial being a coin
flip, and each outcome or variable being either "heads" or "tails". The coin could be biased, as
long as it's consistently biased, i.e. the probabilities don't change.
In such Bernoulli processes, it can be said that the associated variables all follow the
same Bernoulli distribution. Mathematically, this can be written $X \sim \mathrm{Bern}(p)$, where X
represents a random variable, the ~ indicates "follows" (a certain distribution), and Bern(p)
represents the Bernoulli distribution with parameter p, which is, by convention, the probability
of getting a "success" (a 1, "heads", etc.) on each trial, with $0 \le p \le 1$. This notation will be used
frequently in this paper.

Because the Bernoulli distribution only looks at each instance of a random
variable on its own, there are only two possible outcomes, and the probabilities are simply p
and (1 − p). It is not a particularly interesting probability distribution; far more interesting,
useful and common is the binomial distribution, of which the Bernoulli distribution is a special
case. The binomial distribution is a discrete probability distribution, giving the probabilities of
the total number of successes in a sequence of n independent yes/no experiments (or better,
Bernoulli trials, as explained above), with the probability of success on each trial being p.
Thus, for a random variable X that follows a Bernoulli distribution, the notation given above is
tantamount to writing $X \sim B(1, p)$, i.e. n = 1.
In general, if a random variable X follows the binomial distribution with parameters n
(total number of trials in sequence) and p (probability of success on each trial), we can
notate this as X ~ B(n, p). The probability of getting exactly x successes in n trials is given by
the following probability mass function, the first useful, non-trivial one that shall be
introduced in this paper:
$$f(x) = P(X = x) = \binom{n}{x}\, p^{x} (1-p)^{n-x}$$

for $0 \le x \le n$, $x \in \mathbb{N}$. (Note that sometimes q is used in place of (1 − p), such that
there are three parameters, n, p and q.) The $\binom{n}{x}$ is the combinations, or nCr ("n choose r"),
function, and is the binomial coefficient in binomial expansions, hence also the namesake of
the binomial distribution.

It is fairly straightforward to understand the origin of this probability mass function, as
shall be briefly explained below. The probability of getting x successes is $p^{x}$, and of n − x
failures is $(1-p)^{n-x}$. However, since the x successes can occur in any order and place in
the n trials, we must take into account the number of combinations, using the combinations
function, i.e. there are $\binom{n}{x}$ ways to distribute the x successes in a sequence of n trials.
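As a quick check of this PMF, it can be implemented in a few lines of Python (my own sketch; the name binomial_pmf is simply an illustrative choice):

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n Bernoulli trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Example: probability of exactly 5 heads in 10 fair coin flips
print(binomial_pmf(5, 10, 0.5))  # ~0.2461

# The PMF sums to 1 over all possible x, as any distribution must
print(sum(binomial_pmf(x, 10, 0.5) for x in range(11)))  # 1.0
```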


The Poisson distribution


Link and Introduction to Poisson distribution
The Poisson distribution is another discrete probability distribution; it gives the
probability of a certain number of events occurring in a fixed interval of time or space, if you
already know the mean number of events in that fixed interval, and the events occur
independently of when the last one occurred. It is related to the binomial distribution in that it is the
special case where the parameter n tends to infinity and p tends to zero, with np being the
aforementioned mean number of events, denoted by m (or $\lambda$). The Poisson distribution has
only this one parameter, m > 0.
Below are some examples of discrete phenomena (non-continuous, usually meaning
integral number of times) that can be modeled using the Poisson distribution. Note the wide
range of occurrences and applications in very diverse fields.

The number of soldiers killed by horse-kicks each year in the Prussian cavalry (a
classic example, famously analysed by Ladislaus Bortkiewicz; the distribution itself is named for Siméon Poisson)
The number of misprints per page of a book
The number of phone calls arriving at a call center per hour
The number of mutations in a stretch of DNA caused by radiation
The number of cars arriving at a traffic light per minute

Because in real-life situations there can't be infinite trials, nor can the probability of
an event be zero (at least not without being pointless), the Poisson distribution is used to
approximate the binomial distribution when n is large and p is small. The binomial PMF was
quite tedious to use to work out probabilities manually, and even after the advent of
computers, required more processing power. Though that is less of an issue now, we still
use the Poisson approximation simply because it's good enough. A real-life example of this
use is in predicting the number of times a web server is accessed in any given minute, in order to
design systems that can handle the flow of web traffic. The whole business of large n and
small p may sound familiar from my introductory paragraph, and indeed, that is the link
that Bernoulli trials have with the Poisson distribution.
So far, since the introduction there has yet to be another appearance of the topic of this paper, Euler's number e,
amidst all the explanations about probabilities and distributions.
However, I believe it will soon be seen that discussing those topics is important, even crucial,
in investigating and trying to derive the PMF of the Poisson distribution, which, the reader
might remember, involves e. As we were saying earlier of the Bernoulli trials, when n is
large and p is $\frac{1}{n}$ (i.e. very small), the probability of getting no successes in all n Bernoulli
trials is approximately $\frac{1}{e}$. We can calculate and verify this using the PMF of the Poisson
distribution, as this is a situation in which it can be directly applied; we simply substitute the
correct parameter into the formula below.
In general, if a random variable X follows the Poisson distribution with parameter m
(the mean number of successes in a fixed interval of time or space), we can express this as
X ~ Po(m). The probability of getting exactly x successes (or times an event occurs) in the
given interval is given by the following probability mass function:

$$f(x) = P(X = x) = \frac{m^{x} e^{-m}}{x!}$$

for $x \in \mathbb{N}$, with x! being the factorial of x. As can be seen, the topic of this paper,
Euler's number e, finally makes an appearance in this formula.
To apply the Poisson distribution PMF formula to our now twice-mentioned example,
we must first work out the relevant parameter m and argument x. The mean number of
successes m in n trials, with the probability p being $\frac{1}{n}$, is the expected value
$np = n \times \frac{1}{n} = 1$. Since m = np, m = 1 and $X \sim \mathrm{Po}(1)$. To find the probability of getting no successes
in all n trials, we set x = 0. Thus,

$$P(X = 0) = \frac{1^{0}\, e^{-1}}{0!} = e^{-1} = \frac{1}{e}$$

as asserted.
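Since the exact binomial probability of zero successes is $\left(1-\frac{1}{n}\right)^{n}$, the convergence to $\frac{1}{e}$ can also be seen directly; the following Python sketch is my own quick check:

```python
import math

# P(no successes in n trials with p = 1/n) is exactly (1 - 1/n)^n
for n in [10, 100, 10000]:
    exact = (1 - 1/n) ** n
    print(n, exact, "vs 1/e =", 1 / math.e)
```

Even at n = 100 the binomial probability agrees with 1/e to about two decimal places.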

The Poisson distribution can be applied to random experiments (wherein the


sequence of outcomes does not follow any sort of pattern) with many possible outcomes,
each of which is rare. The law of rare events comes into play here, and will be looked into
in detail in the next section.
The law of rare events or Poisson limit theorem
To understand how the Poisson distribution is derived, we must understand what is
known as the law of rare events, also known as the Poisson limit theorem. The "rare events"
alludes to the low probability of an individual event occurring in a random experiment.
Because each event is so rare, if we perform a large number of trials (in other words, give it
very many opportunities to happen), we'd expect the mean number of occurrences to be of
moderate magnitude. So conceptually, the PMF modeling these events will tend towards
some smoother, intermediate function.

Essentially, each event can be thought of as a Bernoulli trial (the idea is more
rigorously justified below), and so it is more precisely modeled by a binomial distribution. But
when the probability p is so small, and the number of trials n is so large, the Poisson limit
theorem states that the Poisson distribution can be used to approximate it (Figure 5).

[Figure 5: As n increases, the Poisson distribution becomes a better and better approximation (m = np = 5 in each case). Note: Density(k) refers to the relative likelihood for a random variable to take on a given value k, as calculated by the PMF.]

Haihao Liu

12W

IB Mathematics Higher Level

Candidate No.: 2631

Suppose we know the mean value, m. There are three main simplifying assumptions one
must make to define a Poisson process, which allow us to think of it as the limiting case of the
binomial distribution:

1. The number of events occurring in non-overlapping time intervals is independent.
2. The probability of an event occurring in a very short time interval of length h is approximately mh.
3. The probability of more than a single event occurring in a very short time interval is essentially zero.

Assumption 1 means knowing how many events occur in a given time interval does
not influence how many will occur in the next. Assumption 2 means the number of events
occurring depends only on the length of the interval, and not on when it occurs; it also means
the probability in each short time interval is identical. Assumption 3 allows us to treat a very
short time interval as a Bernoulli trial, with an event happening counting as a "success", and ensures that only
one success or failure will occur. Overall, the three assumptions imply that these short
intervals are independent Bernoulli trials with identical probability of success, giving us the
basis for applying the binomial distribution as the starting point for our derivation, as the simulation sketch below also suggests.
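To see these assumptions in action, one can chop the interval into many short sub-intervals and treat each as a Bernoulli trial. The following Python sketch (my own simulation, with an illustrative mean of m = 3 and 1,000 sub-intervals) compares the empirical frequency of event counts with the Poisson PMF:

```python
import math
import random

random.seed(0)

m = 3        # mean number of events in the whole interval (made-up value)
n = 1_000    # number of short sub-intervals, i.e. Bernoulli trials
p = m / n    # per-sub-interval success probability (assumption 2)

# Simulate many repetitions of the whole interval, counting events in each
runs = 5_000
counts = [sum(random.random() < p for _ in range(n)) for _ in range(runs)]

# Compare the empirical frequency of each count k with the Poisson PMF
for k in range(7):
    empirical = counts.count(k) / runs
    poisson = m**k * math.exp(-m) / math.factorial(k)
    print(k, round(empirical, 4), round(poisson, 4))
```

The two columns of probabilities agree closely, which is exactly what the law of rare events predicts.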
Derivation of Poisson distribution (Proof of the law of rare events)
Mathematically, the law of rare events, or Poisson limit theorem, states that if
$n \to \infty$ and $p \to 0$ such that $np \to m$, then

$$\binom{n}{x}\, p^{x} (1-p)^{n-x} = \frac{n!}{x!\,(n-x)!}\, p^{x} (1-p)^{n-x} \to \frac{m^{x} e^{-m}}{x!}.$$
!
The proof is as follows. Firstly, because we are dealing with large factorials, we can
replace n! and (n − x)! with Stirling's approximation, $n! \sim \sqrt{2\pi n}\left(\frac{n}{e}\right)^{n}$. Here, ~ denotes "is asymptotically
equal to", appropriate since we are working with limits.

$$\frac{n!}{x!\,(n-x)!}\, p^{x} (1-p)^{n-x} \sim \frac{\sqrt{2\pi n}\,\left(\frac{n}{e}\right)^{n}}{x!\,\sqrt{2\pi (n-x)}\,\left(\frac{n-x}{e}\right)^{n-x}}\, p^{x} (1-p)^{n-x}$$

Simplifying fractions:

$$\frac{\sqrt{2\pi n}\,\left(\frac{n}{e}\right)^{n}}{x!\,\sqrt{2\pi (n-x)}\,\left(\frac{n-x}{e}\right)^{n-x}}\, p^{x} (1-p)^{n-x} = \frac{\sqrt{n}\; n^{n}\, e^{-n}}{\sqrt{n-x}\;(n-x)^{n-x}\, e^{-(n-x)}\; x!}\, p^{x} (1-p)^{n-x}$$

Since $(n-x) \sim n$ as $n \to \infty$, we are able to cancel out those square-root terms; and

$$\frac{e^{-n}}{e^{-(n-x)}} = e^{-n+n-x} = e^{-x} = \frac{1}{e^{x}}.$$

Because $np \to m$, we can replace p with m/n:

$$\frac{n^{n}\, e^{-x}}{(n-x)^{n-x}\, x!}\left(\frac{m}{n}\right)^{x}\left(1-\frac{m}{n}\right)^{n-x} = \frac{n^{n}\, e^{-x}\, m^{x}}{(n-x)^{n-x}\, n^{x}\, x!}\left(1-\frac{m}{n}\right)^{n-x}$$


Now divide the first factor in both the numerator and denominator by $n^{n-x}$:

$$\frac{n^{n}\, e^{-x}\, m^{x}}{(n-x)^{n-x}\, n^{x}\, x!}\left(1-\frac{m}{n}\right)^{n-x} = \frac{n^{x}\, e^{-x}\, m^{x}}{\left(1-\frac{x}{n}\right)^{n-x} n^{x}\, x!}\left(1-\frac{m}{n}\right)^{n-x} = \frac{e^{-x}\, m^{x}}{x!} \cdot \frac{\left(1-\frac{m}{n}\right)^{n-x}}{\left(1-\frac{x}{n}\right)^{n-x}}$$

with $\frac{n^{n}}{n^{n-x}} = n^{n-(n-x)} = n^{x}$, similar to above. Now again, since $(n-x) \sim n$ as $n \to \infty$:

$$\frac{e^{-x}\, m^{x}}{x!} \cdot \frac{\left(1-\frac{m}{n}\right)^{n-x}}{\left(1-\frac{x}{n}\right)^{n-x}} \sim \frac{e^{-x}\, m^{x}}{x!} \cdot \frac{\left(1-\frac{m}{n}\right)^{n}}{\left(1-\frac{x}{n}\right)^{n}}$$
Now we must evaluate the following limit, which involves indeterminate forms:

$$\lim_{n \to \infty}\left(1-\frac{m}{n}\right)^{n}$$

In order to do so, I had to research and learn various techniques and transformations
of limits, in order to deal with and evaluate all the different types. Indeterminate form of type
$1^{\infty}$: transform using $\lim_{n \to \infty} f(n)^{g(n)} = e^{\lim_{n \to \infty} g(n) \ln f(n)}$:

$$\lim_{n \to \infty}\left(1-\frac{m}{n}\right)^{n} = e^{\lim_{n \to \infty} n \ln\left(1-\frac{m}{n}\right)}$$

Indeterminate form of type $\infty \cdot 0$: let $t = \frac{1}{n}$, then

$$\lim_{n \to \infty} n \ln\left(1-\frac{m}{n}\right) = \lim_{t \to 0} \frac{\ln\left(1-mt\right)}{t}$$
Indeterminate form of type 0/0: apply L'Hôpital's rule, which states that for two
functions f and g that are differentiable on $I \setminus \{c\}$, where I is an open interval containing c,
and $\setminus$ denotes the relative complement ("in I but not in {c}") with regard to sets: if
$\lim_{t \to c} f(t) = \lim_{t \to c} g(t) = 0$ or $\pm\infty$, and $\lim_{t \to c} \frac{f'(t)}{g'(t)}$ exists, and $g'(t) \ne 0\ \forall\, t \in I \setminus \{c\}$ ($\forall$ denoting "for all"), then

$$\lim_{t \to c} \frac{f(t)}{g(t)} = \lim_{t \to c} \frac{f'(t)}{g'(t)}.$$

So, we have, differentiating numerator and denominator and factoring out constants:

$$\lim_{t \to 0} \frac{\ln\left(1-mt\right)}{t} = \lim_{t \to 0} \frac{\frac{\mathrm{d}}{\mathrm{d}t}\ln\left(1-mt\right)}{\frac{\mathrm{d}}{\mathrm{d}t}\, t} = \lim_{t \to 0} \frac{\frac{-m}{1-mt}}{1} = -m$$

Therefore:

$$\lim_{n \to \infty}\left(1-\frac{m}{n}\right)^{n} = e^{-m}$$

Ditto for $\lim_{n \to \infty}\left(1-\frac{x}{n}\right)^{n} = e^{-x}$. Therefore:

$$\frac{n!}{x!\,(n-x)!}\, p^{x} (1-p)^{n-x} \sim \frac{e^{-x}\, m^{x}}{x!} \cdot \frac{e^{-m}}{e^{-x}} = \frac{m^{x}\, e^{-m}}{x!}\,;\quad \text{Q.E.D.}$$

In the final steps involving taking the limit, I have finally explicitly seen the direct
origin of the e in the Poisson PMF. I noticed the similarities with the limit definition of e: the
limit of $\left(1+\frac{1}{n}\right)^{n}$ as n approaches infinity. I began to see some of the real-world connections,
in particular to compound interest rates, where we can think of continuous compounding
as a limiting case (represented by the aforementioned limit from which e arises). Moreover,
much like how the Poisson distribution approximates the binomial when n is large and p is
small, the formula for continuous compounding, which you can see contains e, $A = Pe^{rt}$
(where A is the future value, P is the principal or initial amount, r is the annual nominal interest rate, and
t is the number of years the money is invested or borrowed for), approximates the future value when r is very small but
the number of times the interest is compounded per year is very large.
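The convergence just proved can also be observed numerically. The following Python sketch (my own illustration, in the spirit of Figure 5, with m = 5) computes the binomial probability of x = 3 events for increasing n alongside its Poisson limit:

```python
from math import comb, exp, factorial

m, x = 5, 3  # mean number of events, and the count whose probability we examine

def binomial_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p)**(n - x)

poisson = m**x * exp(-m) / factorial(x)
for n in [10, 100, 10000]:
    print(n, binomial_pmf(x, n, m / n))
print("Poisson limit:", poisson)  # ~0.1404
```

Already at n = 100 the two probabilities agree to about two decimal places.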

Stirling's approximation

Derivation of Stirling's approximation

In the derivation above of the Poisson distribution PMF from the binomial distribution
PMF, we replaced the factorials n! and (n − x)! of the nCr function with Stirling's approximation
for large factorials. I have decided that it is worth proving the formula below, firstly to see
why we are justified in using said approximation (by considering the fact that we are taking
the limit as $n \to \infty$ for both functions), and secondly to investigate the interesting fact that e
(and even $\pi$) once again unexpectedly appears in the formula. So, we must prove that for
large values of n,

$$n! \sim \sqrt{2\pi n}\left(\frac{n}{e}\right)^{n}$$
Recall the definition of the factorial function:

$$n! = 1 \times 2 \times \cdots \times n$$

Take the natural log of both sides, as this turns n! into a slowly varying function,
meaning it will converge at infinity. I noticed that since e is the base of the natural logarithm,
it is already coming into play here:

$$\ln n! = \ln 1 + \ln 2 + \cdots + \ln n = \sum_{k=1}^{n} \ln k$$

Because n is large, approximate the summation with an integral:

$$\ln n! \approx \int_{1}^{n} \ln x \,\mathrm{d}x$$

Solve using integration by parts. Let

$$u = \ln x,\quad \mathrm{d}v = \mathrm{d}x \quad\Rightarrow\quad \mathrm{d}u = \frac{1}{x}\,\mathrm{d}x,\quad v = x$$

$$\int \ln x \,\mathrm{d}x = x \ln x - \int x \cdot \frac{1}{x}\,\mathrm{d}x = x \ln x - \int 1 \,\mathrm{d}x = x \ln x - x + C$$

Putting the anti-derivative back into the expression:

$$\ln n! \approx \int_{1}^{n} \ln x \,\mathrm{d}x = \big[x \ln x - x\big]_{1}^{n} = n \ln n - n + 1$$

The +1 term can be ignored, since it becomes insignificant as $n \to \infty$:

$$\ln n! \approx n \ln n - n$$

To express this in a more familiar format, exponentiate both sides. Here, I see this is where
the e comes from in Stirling's formula, at least using this derivation:

$$n! \sim e^{n \ln n - n} = e^{n \ln n}\, e^{-n} = n^{n} e^{-n} = \left(\frac{n}{e}\right)^{n}$$

Better approximations may be found by adding more terms after the −n (the method
and explanation for finding these terms involve advanced mathematical theorems, such as
the Euler-Maclaurin formula and Wallis's product, both of which are beyond the scope of
this paper). Adding the next term, we get:

$$\ln n! \approx n \ln n - n + \frac{1}{2}\ln(2\pi n)$$

Again, after exponentiating both sides, we get the familiar format of the formula:

$$n! \sim e^{n \ln n - n + \frac{1}{2}\ln(2\pi n)} = \left(\frac{n}{e}\right)^{n} \sqrt{2\pi n} = \sqrt{2\pi n}\left(\frac{n}{e}\right)^{n}$$

the most common format for Stirling's approximation, Q.E.D.


An alternative, more precise variant of writing the asymptotic formula is as follows:

$$\lim_{n \to \infty} \frac{n!}{\sqrt{2\pi n}\left(\frac{n}{e}\right)^{n}} = 1$$
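This limit form suggests an easy numerical check: the ratio $n! \,/\, \sqrt{2\pi n}\left(\frac{n}{e}\right)^{n}$ should approach 1. The short Python sketch below is my own verification:

```python
import math

def stirling(n: int) -> float:
    """Stirling's approximation to n!."""
    return math.sqrt(2 * math.pi * n) * (n / math.e) ** n

# The ratio n! / stirling(n) should tend to 1 as n grows
for n in [5, 20, 100]:
    print(n, math.factorial(n) / stirling(n))
```

The ratio exceeds 1 slightly and shrinks roughly like $1 + \frac{1}{12n}$, which is the next correction term in the full asymptotic series.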


Reflection
I set off in this exploration to gain a deeper understanding and appreciation of
the role Euler's number e plays in statistics and probability theory. I decided that the best
way to achieve this was by teaching myself the fairly advanced skills and techniques in the
relevant areas of math needed to derive the probability mass function of a distribution, and,
more importantly, to genuinely understand every step in the proof. I believe I can honestly say I
have fulfilled this aim.

I chose to investigate the Poisson distribution, firstly because it obviously involves e,
which was a requisite, but primarily because I felt it led naturally from an area I already
knew a bit about, namely, Bernoulli trials and the binomial distribution. At times, the math
involved in deriving the formula was rather complex and difficult to understand, especially
because the formal published papers and sources I used, written by and for professional
mathematicians, often contained a lot of shortcuts or left things unexplained. I had to
personally research anything that didn't make complete sense, until I satisfied myself
that I truly comprehended it.
From my work in this exploration researching the derivation of the Poisson PMF, I
feel that I have gained a genuinely better understanding of the nature of e and why
it occurs so ubiquitously. In my introduction, I noted how strange it seemed to me that e,
which is defined by abstract concepts as a purely mathematical construct, has uses in
statistics, which is based on real life. I am now able to reconcile the two ideas, at least in my
head. The reason e appears is that, in real life, we often deal with very large numbers
using approximations. We don't need to know, for example, the exact number of bacteria in
a sample, nor can we. Like my example of continuous compound interest, these are
approximate models of the real world, and often we need simplified models because the real
world is simply too complex. For example, we see this in economic models, or in physics,
where small-angle approximations are frequently used. This is where, mathematically, limits
come into play. And as a consequence, in many cases, so does e.

Conclusion
Euler's number e is indeed one of the most remarkable, unique and fascinating
numbers in all of mathematics, and the work and research I did have utterly convinced me of
that. It is so much more than just a simple, arbitrary number: with its manifestations and
applications ranging far and wide, e is a truly universal and fundamental constant. Just
within this exploration, I looked into two honestly quite disparate, and ostensibly unrelated,
occurrences of e. The Poisson distribution's PMF, in which e features, is a model for real-world
phenomena, while Stirling's formula is an approximation of the factorial n!, largely used
in combinatorics and number theory.

However, as we have seen and demonstrated, the two are intrinsically linked, and this is
most readily appreciated when we go back to the roots and investigate where all these
formulae come from. Having a real understanding of any topic can often allow an individual
to make more profound insights. The fact that e links together so many diverse areas of
mathematics can, in a broader sense, be thought to apply to areas of life beyond math.
Everything is interconnected in some fashion, and if one takes the time to look for and
understand these connections, one gains a deeper appreciation for the beauty of the world.



Bibliography

Clark, Noel. "Derivation of the Poisson Distribution (the Law of Rare Events)." Physics 2150 web page for Fall 2012. University of Colorado, 29 Aug. 2012. Web. 2 May 2013. <http://www.colorado.edu/physics/phys2150/phys2150_fa12/2012F%20POISSON%20DOSTRIBUTION.pdf>.

Eastaway, Rob, and Jeremy Wyndham. Why Do Buses Come in Threes? The Hidden Mathematics of Everyday Life. 1998. Reprint. London: Anova, 2005. Print.

Eastaway, Rob, and Jeremy Wyndham. How Long Is a Piece of String? More Hidden Mathematics of Everyday Life. 2002. Reprint. London: Portico, 2008. Print.

Ma, Dan. "Poisson as a Limiting Case of Binomial Distribution." A Blog on Probability and Statistics. Wordpress.com, 18 Aug. 2011. Web. 2 May 2013. <http://probabilityandstats.wordpress.com/2011/08/18/poisson-as-a-limiting-case-of-binomial-distribution/>.

Prokhorov, A.V., and contributors. "Poisson Theorem." Encyclopedia of Mathematics, 1 Mar. 2013. Web. 4 Mar. 2013.

Stewart, Ian. Professor Stewart's Cabinet of Mathematical Curiosities. 2008. Reprint. New York: Basic, 2009. Print.

Weisstein, Eric W. "Poisson Distribution." MathWorld--A Wolfram Web Resource. Wolfram Research, Inc., 8 May 2013. Web. 9 May 2013.

Wikipedia contributors. "Poisson Distribution." Wikipedia, The Free Encyclopedia, 13 Feb. 2013. Web. 14 Feb. 2013.

Wikipedia contributors. "Poisson Limit Theorem." Wikipedia, The Free Encyclopedia, 24 Dec. 2013. Web. 14 Feb. 2013.

Wikipedia contributors. "Stirling's Approximation." Wikipedia, The Free Encyclopedia, 1 Apr. 2013. Web. 4 Apr. 2013.

