
Probabilistic Systems Analysis

and
Applied Probability
Lecture 1

How to deal with uncertainty in life?


The way to deal with it is to use the models given by probability, which provide a systematic approach.
A probability model is a mathematical description of an uncertain situation.

What does it take to set a probabilistic model?

• Sample space: a description of all the things that may happen, in a precise context.
Suppose we perform an experiment, such as flipping a coin or rolling a die.
We come up with the list of all the possible things that may happen during this experiment,
in other words all the possible outcomes.
At this point, formally speaking, we define our list as a set.

This list (that is, the set) should be:


1. Mutually exclusive: so that when the experiment is carried out, there is a unique outcome.
For example, the sample space associated with the roll of a die cannot contain “1 or 3” as a
possible outcome and also “1 or 4” as another possible outcome, because we would not be
able to assign a unique outcome when the roll is a 1.
2. Collectively exhaustive: no matter what happens during the experiment, the
outcome of the experiment should be an element of the sample space.
Every probabilistic model involves an underlying process, called the experiment, that will
produce exactly one out of several possible outcomes. The set of all possible outcomes is
called the sample space of the experiment, and is denoted by Ω. A subset of the sample space,
that is, a collection of possible outcomes, is called an event. There is no restriction on what
constitutes an experiment. For example, it could be a single toss of a coin, or three tosses, or
an infinite sequence of tosses. However, it is important to note that in our formulation of a
probabilistic model, there is only one experiment. So, three tosses of a coin constitute a single
experiment, rather than three experiments. The sample space of an experiment may consist
of a finite or an infinite number of possible outcomes. Finite sample spaces are conceptually
and mathematically simpler. Still, sample spaces with an infinite number of elements are
quite common. For an example, consider throwing a dart on a square target and viewing the
point of impact as the outcome.
• Probability law: describes how likely a particular event is to occur compared to
the other ones.

But how much detail do we need to set up a sample space?


When we build a sample space, we should aim for elements that actually characterize the
phenomenon: elements that give no information to our probabilistic model only clutter the
representation and should be left out.
On the other hand, looking at it from another point of view, we may want to collect all the
information that allows us to describe the phenomenon more accurately, that is, to keep the
elements that are genuinely related to the outcome. In short, when we define a sample space
we should look for the elements that characterize the phenomenon, without including elements
we are not interested in, because otherwise the resulting model may be ineffective.

For example, in a coin-flipping experiment, we can introduce two types of sample space:

The first one consists of {H, T};


The second one consists of {H, T+raining, T+not raining}.
In this context, the choice of which sample space to take is driven by our belief about
whether the coin's outcome is correlated with the weather.

The issue underlying this case is universal: when you are dealing with a model representing a
certain situation, you may have zillions of details describing it. The next step consists of
choosing a set of variables to keep, ignoring all the others, on the grounds that the ignored
ones should have little influence on the outcome likelihood.
Probability Laws

Discrete Case (Finite Sample Space)


Example: rolling a four-sided die twice (we will consider the two rolls as one overall
experiment made of two stages of the same type of process).
If the first roll gives 2 and the second gives 3, the outcome¹ is (2,3);
the outcome (3,2) is not equal to it, because order matters.
To represent visually an experiment made up of a sequence of stages, it is useful to draw a
diagram (tree diagram) that shows how these stages are involved. We
can observe that each combination of results is associated with a particular path in the tree
diagram.

In this case, the number of possible outcomes is 16.

¹ Note that we use the word "outcome" for the overall experiment; to refer to a single stage
of the overall experiment we use the word "result".
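The tree-diagram enumeration described above can be sketched in code; here is a minimal illustration:

```python
from itertools import product

# Sample space for rolling a four-sided die twice:
# each outcome is an ordered pair (first result, second result).
sample_space = list(product(range(1, 5), repeat=2))

print(len(sample_space))       # 16 possible outcomes
print((2, 3) in sample_space)  # True
print((2, 3) == (3, 2))        # False: order matters
```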
Continuous Case (Infinite Cardinality Sample Space)

In this case, considering a point (described by real numbers) as an outcome, there is an
infinite set of possible outcomes.

These are some of the possible types of sample space that occur in probability.

The next stage is to look at all the possible outcomes and make some statements about their
relative likelihood: which outcome is more likely to occur compared to the others?
We assign probabilities to the outcomes. But sometimes this approach is not meaningful.

For example, in the previous case of an infinite-cardinality sample space, the probability of
hitting one exact point with a random selection process is intuitively 0. Assigning
probability 0 to each point inside the square gives us no useful metric for evaluating
likelihood.
To get around this problem, we will assign probabilities to subsets of the sample space
instead of to individual elements of our set (our list).

We introduce the notion of an event, as a subset of the sample space.


If an outcome (a specific point, or element) occurs inside a subset A of Ω (the total sample
space), this means that the event A occurred; otherwise the event A did not occur.

How to assign probabilities to subsets corresponding to events?

Probabilities are meant to describe our beliefs about which sets are more likely to occur
than others, so there are many ways to assign these probabilities.
But there are also ground rules to define first.

We introduce the convention of normalization of probability, which means that a probability is


a number between 0 (we are certain that something is not going to happen) and 1
(we are certain that something is going to happen).
We also define the axioms of probability, in other words the ground rules that any legitimate
probabilistic model should obey. We can use many different probability assignments, but all
of them must obey these axioms (which is what makes calculation and reasoning possible).

Probability Axioms
o First, probability should be non-negative (Nonnegativity).

The second one is interesting, but needs a preliminary notion about treating multiple sets,
giving an intuitive picture of what happens when we try to determine the probability of a
combination of events and how it is expressed in mathematical language:

Given two sets (events) A and B, their intersection
consists of the elements that belong to both A and B (a); in
probabilistic language, this means that A occurred and B
occurred. Their union, instead, is the collection of all
elements that belong either to the first set, or to the second, or
to both of them (b).

o Second, if we have two events with no common elements (disjoint subsets), the total
probability of their union is the sum of the individual probabilities of each subset. This
also applies to a sequence of countably many disjoint events, that is, one for which there
exists a one-to-one correspondence between the index set and the integers. (Additivity)

o Third, the probability of the entire sample space is equal to one. By the collectively
exhaustive property of the total sample space Ω, any possible outcome belongs to it, so the
probability of getting an element inside Ω is 1. In other words, no matter what the outcome
is, it is certain that this event is going to occur. (Normalization)
Notation for the probability axioms: P(A) ≥ 0; if A ∩ B = ∅ then P(A ∪ B) = P(A) + P(B); P(Ω) = 1.
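These ground rules can be checked mechanically. A minimal sketch, assuming a finite sample space and a made-up dictionary mapping outcomes to probabilities (both hypothetical, not from the lecture):

```python
def satisfies_axioms(law, tol=1e-9):
    """Check a discrete probability law given as {outcome: probability}.

    Nonnegativity: every probability is >= 0.
    Normalization: the probabilities sum to 1.
    (Additivity for disjoint events then follows by summing
    element probabilities over each event.)
    """
    nonneg = all(p >= 0 for p in law.values())
    normalized = abs(sum(law.values()) - 1.0) < tol
    return nonneg and normalized

fair_coin = {"H": 0.5, "T": 0.5}
broken = {"H": 0.7, "T": 0.7}

print(satisfies_axioms(fair_coin))  # True
print(satisfies_axioms(broken))     # False: probabilities sum to 1.4
```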

Are these axioms enough for what we want to do?


We said that we want a probability to be a number between 0 and 1. Do we need another axiom
telling us that probabilities are less than or equal to one?
No: we can derive it from the previous ones (so it is not a necessary axiom, it is a lemma):
1 = P(Ω) = P(A ∪ Aᶜ) = P(A) + P(Aᶜ) ≥ P(A), since P(Aᶜ) ≥ 0.

How about the union of three disjoint sets (A, B, C)?


P(A ∪ B ∪ C) = P((A ∪ B) ∪ C), by associativity of the union (so there is no priority in the
calculation; a different order gives the same result). Calling A ∪ B another set, D, which is
still disjoint from C, the additivity axiom gives:
P(D ∪ C) = P(D) + P(C) = P(A) + P(B) + P(C).
So, from the axiom valid for two subsets, it is possible to conclude that for any finite
collection of disjoint sets A1, ..., An, the probability of their union is the sum of the
probabilities of the individual sets: P(A1 ∪ ... ∪ An) = P(A1) + ... + P(An).

How is the probability of a finite set of elements (outcomes) determined?

Discrete Probability Law


If the sample space consists of a finite number of possible outcomes, then the probability law
is specified by the probabilities of the events that consist of a single element.
A simple consequence is that the total probability of a subset is the sum of the
probabilities of its element sets. Formally, for a finite subset of k single-element sets:
P({s1, s2, ..., sk}) = P(s1) + P(s2) + ... + P(sk).

Note: it is possible to encounter sets so strange that we cannot really visualize them; for
such sets there is no consistent way to assign probabilities according to the axioms.
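As a sketch of this law, using a hypothetical loaded four-sided die (the weights below are invented for illustration):

```python
# Discrete probability law: the probability of an event is the sum of
# the probabilities of its single-element sets.
law = {1: 0.1, 2: 0.2, 3: 0.3, 4: 0.4}  # made-up loaded die

def prob(event, law):
    """P({s1, ..., sk}) = P(s1) + ... + P(sk)."""
    return sum(law[s] for s in event)

print(prob({2, 4}, law))        # 0.2 + 0.4
print(prob({1, 2, 3, 4}, law))  # whole sample space: sums to 1
```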

In the case where each element is equally likely to occur (equiprobability), the following
probability law is adopted:

Discrete Uniform Probability Law

Let all the outcomes be equally likely. Then, for any event A,
P(A) = (number of elements of A) / (total number of elements of Ω).

In this case, estimating the probability of an event reduces to counting. This is pretty
simple in some cases, and it will be discussed further later.
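The counting involved can be illustrated with a short sketch (the "doubles" event is a made-up example for the two-roll die experiment):

```python
from itertools import product
from fractions import Fraction

# Discrete uniform law: P(A) = (outcomes in A) / (total outcomes).
omega = list(product(range(1, 5), repeat=2))  # two rolls of a 4-sided die

# Example event: the two rolls are equal.
doubles = [(x, y) for (x, y) in omega if x == y]
print(Fraction(len(doubles), len(omega)))  # 4/16 = 1/4
```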
How is the probability of an infinite set of elements (outcomes) determined?

Continuous Models

Probabilistic models with continuous sample spaces differ from their discrete counterparts in
that the probabilities of the single-element events are no longer sufficient to characterize
the probability law.
Lecture 2

Reviewing the second axiom (additivity): is it possible to assign a positive probability to
each point inside the unit square without making the other axioms fall apart?
First, we should notice that this sample space is uncountably infinite.
If additivity applied to the union of all the single-point sets making up the unit square,
then, using area to measure probability, we would obtain:
1 = P(unit square) = sum of P({point}) over all points = 0 + 0 + ... = 0,
a contradiction. This is why the additivity axiom applies only to a countable sequence of
sets whose union we take. Since there is no one-to-one correspondence between the real
numbers and the integers, the points of the square cannot be arranged in such a sequence, so
the additivity axiom does not apply to them and no contradiction arises.
In continuous models, a probability of 0 means that the event is extremely unlikely to occur,
not necessarily impossible; likewise, a probability of 1 means that the event is extremely
likely to occur, not necessarily certain.
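A small simulation illustrates this point, under the assumption (as in the dart example) that probability corresponds to area:

```python
import random

random.seed(0)  # reproducibility for this sketch

# A point drawn uniformly from the unit square lands in a subregion
# with probability equal to the region's area; any single exact point
# has probability 0, yet some point always occurs.
n = 100_000
hits = 0
for _ in range(n):
    x, y = random.random(), random.random()
    if x < 0.5 and y < 0.5:  # lower-left quarter, area 0.25
        hits += 1
print(hits / n)  # approximately 0.25
```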

Conditional Probability
We introduce conditional probability to express a probability estimate when we are given some
partial information about the phenomenon. The starting point for the definition of conditional
probability is a chance experiment for which a sample space and a probability measure P are
defined.

Given some initial beliefs about the outcome of each event (the probabilities assigned in the
figure above), we are told that event B occurred. The probability P(B) reflects our knowledge
of the occurrence of event B before the experiment takes place. Therefore the probability
P(B) is sometimes referred to as the a priori probability of B, or the unconditional
probability of B. Being told that B occurred means that the outcome is going to lie inside
the set B. In this case we should change our beliefs about what is likely to happen and what
is not, and the updated belief is denoted by the following notation: P(A | B).

This is the conditional probability that the event A is going to occur (the probability that
the outcome is going to fall inside the set A) given that the event B happened (so the
outcome lies in the subset B).
We should now focus on B as our "new" sample space, since we can make a more precise
evaluation of the probabilities of the outcomes we still have to face.
The consequence is that our beliefs about the probabilities related to A and B change:
P(B | B) = 1, which stands for "the probability of getting an outcome inside B, given that B
occurred, is 1".
What is the likelihood of A given that B occurred (the second probability measure)?
In the figure, one outcome inside B was initially twice as likely as the other, so we keep
the same proportions. Since the probabilities must now sum to 1, we get probability 2/3 for
the first outcome and 1/3 for the second.
Using the definition:
P(A | B) = P(A ∩ B) / P(B), defined only when P(B) > 0.
This says that the probability of getting an outcome inside A, given that B happened, equals
the probability of their intersection divided by the total probability of B.

An alternative expression of the same rule has a nice interpretation:


P(A ∩ B) = P(B) · P(A | B).
From a frequentist point of view (thinking of probabilities as frequencies): if I do the
experiment over and over, what fraction of the time will it be the case that both A and B
occur?

There is going to be a fraction of the time at which B occurs, and out of the times when B
occurs there is going to be a further fraction of the experiments in which A also occurs. You
look only at those experiments in which B happens to occur, and ask in what fraction of those
experiments A occurs as well.
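This frequentist reading can be sketched as a simulation; the two events below are made-up examples for the two-roll die experiment:

```python
import random

random.seed(1)  # reproducibility for this sketch

# Frequentist reading of P(A | B): repeat the experiment many times
# and, among the trials where B occurred, measure the fraction in
# which A also occurred.  (Events A and B are illustrative choices.)
n = 200_000
b_count = 0
ab_count = 0
for _ in range(n):
    first = random.randint(1, 4)
    second = random.randint(1, 4)
    B = (first + second >= 6)  # event B: the sum is at least 6
    A = (first == second)      # event A: the two results are equal
    if B:
        b_count += 1
        if A:
            ab_count += 1
print(ab_count / b_count)  # estimates P(A | B); exact value is 2/6 = 1/3
```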
There’s also the symmetrical equality of the previous alternative form of the conditional rule:
;

Conditional probabilities are like the ordinary probabilities we have already seen: they obey
the axioms.
The general setting is that we get some partial information about the outcome that is going
to occur; we then focus on the event that occurred (our partial information) to produce a
better estimate of the probability of each possible outcome, keeping the ratios initially
formulated.
An example of how conditional probability still obeys the axioms: for disjoint events A and C,
P(A ∪ C | B) = P(A | B) + P(C | B).
Exercise:
Roll a four-sided die twice, all 16 outcomes equally likely. Let B be the event
min(X, Y) = 2 (the orange event in the figure), and let M = max(X, Y).

1. P(M = 1 | B) = 0, since inside B the minimum is 2, so the maximum cannot be 1;
2. P(M = 2 | B) = 1/5.

It is possible to determine the second probability quickly by noting that the proportions
between outcomes are unchanged (because of our uniform distribution): exactly one of the
five equally likely outcomes in our "new" sample space B satisfies M = 2.
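The exercise can be checked by counting inside the reduced sample space. A sketch, assuming (from the missing figure) that B is the event min(X, Y) = 2 and M = max(X, Y):

```python
from itertools import product
from fractions import Fraction

# Two rolls of a four-sided die, all 16 outcomes equally likely.
omega = list(product(range(1, 5), repeat=2))

# Assumed from the missing figure: B = {min(X, Y) = 2}.
B = [(x, y) for (x, y) in omega if min(x, y) == 2]

def p_max_given_B(m):
    """P(M = m | B) by counting inside the reduced sample space B."""
    return Fraction(sum(1 for (x, y) in B if max(x, y) == m), len(B))

print(p_max_given_B(1))  # 0:   impossible, since max >= min = 2
print(p_max_given_B(2))  # 1/5: only (2, 2) has max 2 inside B
```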

Potrebbero piacerti anche