Chapter 0
Introduction to Statistical Decision
The excitement that a gambler feels when making a bet is equal to the
amount he might win times the probability of winning it.
Blaise Pascal

The only useful function of a statistician is to make predictions, and thus
to provide a basis for action.
W.E. Deming

0.1 Introduction
The following are examples of decisions and of theoretical problems that
they give rise to.
Shall I bring the umbrella today?
The decision depends on something which I do not know, namely whether it will rain or
not.
I am looking for a car to buy. Shall I buy this one?
This car looks fine, but perhaps I will find a still better car for the same price if I go on
searching. When shall I stop the search procedure?
The court has to decide whether the defendant is guilty or not.
There are two mistakes that the court can make, namely to convict an innocent person
and to acquit a guilty person. What principles should the court apply if it considers the
first of these mistakes to be more serious than the second?
A committee has to make a decision, but its members have
different opinions.
What rules should they use to ensure that they can reach a conclusion even if they are in
disagreement?
Almost everything that a human being does involves decisions. Therefore, to theorize
about decisions is almost the same as to theorize about human activity.
A drug company is deciding whether or not to sell a new pain reliever.
The question is how much of the drug to produce. However, decision-making processes
usually involve uncertainty. In this case, the proportion of the population that will buy
this pain reliever is not known. If we over-produce, there will be an excess of drugs
which cannot be sold; if we under-produce, we cannot maximize our profit. This is the

potential loss for the company. The main goal of this study is to find an action or
decision rule that reduces our loss as much as possible.

Abraham Wald (1902-1950)

The goal of this course is to give an overview of fundamental ideas and results about
rational decision making under uncertainty, highlighting the implications of these results
for the philosophy and practice of statistics.
Decision theory, as the name implies, is concerned with the process of making
decisions; as a statistical discipline it was introduced by Abraham Wald. Unlike
classical statistics, which is directed solely towards using sample information to make
inferences about unknown numerical quantities, decision theory attempts to combine the
sample information with knowledge of the consequences of our decisions.
Statistical decision theory extends this to decision making in the presence of
statistical knowledge, which sheds light on some of the uncertainties involved. The
elements of decision theory are quite logical and perhaps even intuitive. The classical
approach facilitates the use of sample information in making inferences about the
unknown quantities. Other relevant information includes knowledge of the possible
consequences, which is quantified by a loss function, and prior information, which
arises from sources other than the statistical investigation. The use of Bayesian
analysis in statistical decision theory is therefore natural: their unification provides
a foundational framework for building and solving decision problems. That is, decision
theory attempts to combine the sample information with the loss function and the prior
information. The basic ideas of decision theory and of decision-theoretic methods lend
themselves to a wide variety of applications as well as to computational and analytic
advances.


0.2 Basic Elements


To work with the problem mathematically, it is necessary to employ some notation.
The unknown quantity which affects the decision process is called the state of nature
and is commonly denoted by θ. The parameter space, which contains all the possible
values of θ, is denoted by Θ. We also let a denote the action we take and let A denote
the set of all possible actions.

Loss function
In order to make the right decision, we need to understand the consequences of taking
an action under uncertainty. This information is summarized in a loss function.
Definition 0.2.1
The loss function, L : Θ × A → E, represents the loss L(θ, a) incurred when the action a
is employed and θ turns out to be the true state of nature.
We express the consequences in terms of loss. If the consequence of an action is a
reward, we multiply the value of the reward by -1. Therefore, maximizing the reward
becomes minimizing the loss.
Example 0.2.1
Returning to the drug example, let θ be the proportion of the population that will buy
this drug; thus Θ = [0, 1]. The action in this problem is an estimate of θ. Hence,
a ∈ [0, 1] = A. The company defines the loss function as
L(θ, a) = θ - a        if a ≤ θ,
          2(a - θ)     if θ < a.
In other words, the company considers that over-estimation of the demand will be
twice as costly as under-estimation. This kind of loss function is called weighted linear
loss.
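As an illustration, here is a minimal Python sketch of this weighted linear loss; the
function name and the numerical values 0.4, 0.3 and 0.5 are our own, chosen only for
this example.

def weighted_linear_loss(theta: float, a: float) -> float:
    # L(theta, a) = theta - a if a <= theta, and 2*(a - theta) if theta < a
    if a <= theta:
        return theta - a        # under-estimation of the demand
    return 2.0 * (a - theta)    # over-estimation: twice as costly

# If the true proportion of buyers is theta = 0.4, then under-estimating it
# as a = 0.3 costs about 0.1, while over-estimating it as a = 0.5 costs about 0.2.
print(weighted_linear_loss(0.4, 0.3))   # approximately 0.1
print(weighted_linear_loss(0.4, 0.5))   # approximately 0.2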
The main goal of decision making is to find an action which incurs the least
loss. A decision problem is said to be a no-data problem when no data are available.
When data are available, we shall make use of the information in the data to choose a
suitable action.


Definition 0.2.2
A decision function d is a statistic that takes values in A. That is, d is a function
that maps E^n into A. The class of all decision functions that map E^n into A is called
the decision space, denoted by D. Note that if X = x is observed, then we take the
action d(x) ∈ A.
Remark: In no-data problems the decision space and the action space are the same.
These things are always easier to think about with an example in mind.
So, let's say I offer you a bet. I have two coins: one biased towards heads (70-30
heads) and one even more strongly biased towards tails (80-20 tails). I'll randomly
choose one of the coins (not telling you which one), flip it, then let you make a bet on
whether the coin was biased towards heads or tails. You can take heads or tails, but the
odds are different:
                           you bet Heads (H)      you bet Tails (T)
Coin is heads (70-30)      you win RM1.00         you lose RM1.00
Coin is tails (20-80)      you lose RM10.00       you win RM10.00

How do you decide whether to say heads (H) or tails (T), given the number
of heads and tails that came up? I guess this depends on how lucky you feel...
Decision theory says: forget luck and look at the expected loss. First we need a loss, or
cost function. This function quantifies how badly it hurts if we guess H when we
should have guessed T, and vice versa.
Now we want to compute the expected loss of whatever decision strategy we can think
of, and then choose the strategy that minimizes the expected loss. Let's list our
ingredients:

A decision strategy is any possible function of the data, guess(X) ∈ A = {H, T}.
(This is just a fancy way of saying: you see some data and make a choice, guess(X),
between the possible states of the coin, where the action space, the set of actions
that make any sense, is A = {H, T}.)

The variable X is the number of heads observed.


We need the likelihood of having observed X, given that the coin was actually H
or T. This is given by our trusty binomial formula. Let's say I flipped the coin once,
and the number of observed heads is x. Then

P(X = x | coin biased towards heads) = (0.7)^x (0.3)^(1-x),   x = 0, 1;

and

P(X = x | coin biased towards tails) = (0.2)^x (0.8)^(1-x),   x = 0, 1.
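As a quick numerical check, here is a small Python sketch of these two probability mass
functions; the function name is ours, purely for illustration.

def likelihood(x: int, theta: float) -> float:
    # P(X = x | theta) = theta**x * (1 - theta)**(1 - x), for x in {0, 1}
    return theta ** x * (1.0 - theta) ** (1 - x)

# Coin biased towards heads (theta = 0.7): P(X = 1) = 0.7, P(X = 0) = 0.3
print(likelihood(1, 0.7), likelihood(0, 0.7))
# Coin biased towards tails (theta = 0.2): P(X = 1) = 0.2, P(X = 0) = 0.8
print(likelihood(1, 0.2), likelihood(0, 0.2))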

We already defined our loss function. Here losses are tabulated as

                           H (head)     T (tail)
Biased towards head          -1             1
Biased towards tail          10           -10

Finally, we have one last parameter that plays an important role: the probability of
heads for the coin I chose at the very beginning. Call this parameter θ, and let Θ
denote the set of possible values of θ. In our case Θ = {0.7, 0.2}. Now the above loss
table can be expressed in terms of the parameter and the actions as
             H (head)     T (tail)
θ = 0.7        -1             1
θ = 0.2        10           -10

There are four possible decisions, namely

          d1     d2     d3     d4
x = 0     H      H      T      T
x = 1     H      T      H      T

So let's put everything together. We'll compute the expected loss as a function of θ,
a.k.a. the risk function. The risk functions of the four decisions are
R(0.7, d1) = -1 · P(X = 0 | θ = 0.7) - 1 · P(X = 1 | θ = 0.7) = -1
R(0.2, d1) = 10 · P(X = 0 | θ = 0.2) + 10 · P(X = 1 | θ = 0.2) = 10
R(0.7, d2) = -1 · P(X = 0 | θ = 0.7) + 1 · P(X = 1 | θ = 0.7) = 0.4
R(0.2, d2) = 10 · P(X = 0 | θ = 0.2) - 10 · P(X = 1 | θ = 0.2) = 6
R(0.7, d3) = 1 · P(X = 0 | θ = 0.7) - 1 · P(X = 1 | θ = 0.7) = -0.4
R(0.2, d3) = -10 · P(X = 0 | θ = 0.2) + 10 · P(X = 1 | θ = 0.2) = -6
R(0.7, d4) = 1 · P(X = 0 | θ = 0.7) + 1 · P(X = 1 | θ = 0.7) = 1
R(0.2, d4) = -10 · P(X = 0 | θ = 0.2) - 10 · P(X = 1 | θ = 0.2) = -10
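Since these computations are mechanical, the following Python sketch (our own
illustration, not part of the original notes; all names are hypothetical) reproduces the
risk table by enumerating the four decision rules:

# Loss L(theta, action), read off the loss table above.
LOSS = {
    (0.7, "H"): -1, (0.7, "T"): 1,
    (0.2, "H"): 10, (0.2, "T"): -10,
}

def likelihood(x: int, theta: float) -> float:
    # P(X = x | theta) for a single flip, x in {0, 1}
    return theta ** x * (1.0 - theta) ** (1 - x)

def risk(theta: float, rule) -> float:
    # R(theta, d) = sum over x of L(theta, d(x)) * P(X = x | theta)
    return sum(LOSS[(theta, rule(x))] * likelihood(x, theta) for x in (0, 1))

rules = {
    "d1": lambda x: "H",                        # always guess heads
    "d2": lambda x: "H" if x == 0 else "T",
    "d3": lambda x: "T" if x == 0 else "H",
    "d4": lambda x: "T",                        # always guess tails
}

for name, rule in rules.items():
    print(name, risk(0.7, rule), risk(0.2, rule))
# Up to floating-point rounding, this prints the risks computed above:
# d1: -1, 10    d2: 0.4, 6    d3: -0.4, -6    d4: 1, -10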
Which decision d_i should you choose?
