By Ritwik Vashistha and Shivam Pandey
WEAK LAW OF LARGE NUMBERS
The weak law of large numbers (also called Khinchin's law) states that for i.i.d.
random variables, the sample average converges in probability towards the expected value.
Interpreting this result, the weak law states that for any nonzero margin, no matter
how small, with a sufficiently large sample there will be a very high probability that the
average of the observations will be close to the expected value; that is, within the margin.
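The statement above can be checked empirically. The sketch below averages repeated fair-coin flips (expected value 0.5) and watches the sample mean settle near 0.5 as the sample grows; the sample sizes and seed are illustrative choices, not from the text.

```python
import random

random.seed(42)  # illustrative seed for reproducibility

def sample_average(n):
    """Average of n i.i.d. Bernoulli(0.5) draws."""
    return sum(random.random() < 0.5 for _ in range(n)) / n

# Larger samples concentrate more tightly around the expected value 0.5.
for n in (10, 1000, 100000):
    print(n, sample_average(n))
```

The deviation from 0.5 shrinks (in probability) as n grows, which is exactly the "within the margin" guarantee of the weak law.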
IMPLICATIONS OF THE WEAK LAW OF LARGE NUMBERS
Bernoulli said, “If observations of all
events be continued for the entire infinity, it
will be noticed that everything in the world
is governed by precise ratios and a
constant law of change.”
Now, you might be interested in finding the probability that, 15 days from now, it
will be a nice day, given that today it rained or was a nice day.
The WLLN is not applicable here, so how can we find a ‘pre-determined Statistical Fate’?
FINDING CONDITIONAL PROBABILITIES
We see that if it is rainy today, then the event that it is nice two days from now is the disjoint
union of the following three events:
1) it is rainy tomorrow and nice the next day,
2) it is nice tomorrow and nice the next day, and
3) it is snowy tomorrow and nice the next day.
The probability of the first of these events is the product of the conditional probability that it
is rainy tomorrow, given that it is rainy today, and the conditional probability that it is nice two
days from now, given that it is rainy tomorrow.
We can write this product as P_RR × P_RN.
Thus, we have P_RN(2) = P_RR × P_RN + P_RN × P_NN + P_RS × P_SN
P_RN(2) = 0.5 × 0.25 + 0.25 × 0 + 0.25 × 0.25 = 0.1875 ≈ 0.188
This equation should remind us of a dot product of two vectors: we are
dotting the first row of P (the Rainy row) with the second column of P (the Nice column).
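The dot-product observation generalises: squaring the transition matrix P gives all two-step probabilities at once. A minimal sketch, assuming the Land-of-Oz-style matrix implied by the numbers in the example (rows/columns ordered R, N, S; treat the exact entries as an assumption):

```python
# One-step transition matrix; rows/columns ordered Rainy, Nice, Snowy.
# Entries are inferred from the worked example and are an assumption.
P = [
    [0.50, 0.25, 0.25],  # from Rainy
    [0.50, 0.00, 0.50],  # from Nice
    [0.25, 0.25, 0.50],  # from Snowy
]

def mat_mul(A, B):
    """Plain-Python matrix product: each entry is a row-column dot product."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

P2 = mat_mul(P, P)
# P2[0][1] is P_RN(2): the Rainy row of P dotted with the Nice column of P.
print(P2[0][1])  # 0.1875
```

Repeated multiplication (P³, P⁴, …) gives the n-step probabilities the same way.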
Here our random variable is X_t, which can take the values Rainy Day, Nice Day, or Snowy Day
at time period t.
The range (possible values) of the random variables in a stochastic process is called
the state space of the process.
So our state space is {Rainy, Nice, Snowy}, or {R, N, S}.
MARKOV PROPERTY
A stochastic process has the Markov property if
the conditional probability distribution of future
states of the process (conditional on both past
and present states) depends only upon the
present state, not on the sequence of events that
preceded it.
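In symbols, the property says the full history collapses to the present state:

```latex
P(X_{t+1} = x \mid X_t = x_t, X_{t-1} = x_{t-1}, \ldots, X_0 = x_0)
  = P(X_{t+1} = x \mid X_t = x_t)
```

This is why a single transition matrix P, as above, fully describes the weather chain.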
MARKOV DECISION PROCESS
Basis for sequential decision making.
Return:
G_t = R_{t+1} + R_{t+2} + R_{t+3} + R_{t+4} + … + R_T
Discounted Return:
G_t = R_{t+1} + γ·R_{t+2} + γ²·R_{t+3} + …
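The discounted return is a straightforward geometric-weighted sum over a reward sequence; a minimal sketch, with the reward values and γ = 0.9 as illustrative assumptions:

```python
# Discounted return G_t = R_{t+1} + γ·R_{t+2} + γ²·R_{t+3} + …
# for a finite reward sequence; rewards and gamma are illustrative.
def discounted_return(rewards, gamma):
    """Sum of gamma**k * rewards[k] over the sequence."""
    return sum(gamma ** k * r for k, r in enumerate(rewards))

print(round(discounted_return([1, 1, 1, 1], 0.9), 3))  # 3.439
```

With γ < 1 the infinite sum converges, which is what makes the discounted return well defined over unbounded horizons.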
Q-VALUE AND Q-LEARNING
• Q-value is simply the expected return of taking an action in any given state.
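Q-learning estimates these values by nudging each Q(s, a) toward a bootstrapped target. A minimal sketch of one tabular update; the states, actions, rewards, and the α/γ values are illustrative assumptions, not from the slides:

```python
# One tabular Q-learning update:
# Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Move Q[s][a] toward the bootstrapped target r + gamma * max Q[s_next]."""
    target = r + gamma * max(Q[s_next].values())
    Q[s][a] += alpha * (target - Q[s][a])

# Hypothetical two-state, two-action table, initialised to zero.
Q = {"rainy": {"stay": 0.0, "go": 0.0},
     "nice":  {"stay": 0.0, "go": 0.0}}
q_update(Q, "rainy", "go", r=1.0, s_next="nice")
print(Q["rainy"]["go"])  # 0.1
```

Repeating this update over many experienced transitions drives the table toward the true expected returns.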
[Diagram: possible states]
EXPLORATION VS. EXPLOITATION
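A common way to balance this trade-off is the ε-greedy rule: explore a random action with small probability ε, otherwise exploit the current best estimate. A minimal sketch; the ε value, Q-table, and action names are illustrative assumptions:

```python
import random

def epsilon_greedy(q_values, epsilon=0.1, rng=random):
    """q_values: dict mapping action -> estimated Q-value."""
    if rng.random() < epsilon:
        return rng.choice(list(q_values))   # explore: random action
    return max(q_values, key=q_values.get)  # exploit: best-known action

# Hypothetical Q-estimates: "go" currently looks better than "stay".
q = {"stay": 0.3, "go": 0.7}
random.seed(0)  # illustrative seed
counts = {a: 0 for a in q}
for _ in range(1000):
    counts[epsilon_greedy(q)] += 1
print(counts)  # "go" dominates, with occasional exploratory "stay" picks
```

Decaying ε over time shifts the agent from exploration early on to exploitation once its estimates are trustworthy.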
HOW MUCH CAN I LEARN?
For a fixed value of the parameters,
THANK YOU