
In the Name of ALLAH, the beneficent, the Merciful,

O Allah, send your salutations upon Muhammad (PBUH) & on the Family
of Muhammad (PBUH) as you sent your salutations upon Ibrahim & on the
Family of Ibrahim verily you are Most Praiseworthy & Glorious

Quantitative Methods for Decision Making
A Practical and Philosophical approach
By,
Yaseen Ahmed Meenai
Faculty, FCS-IBA
ymeenai@iba.edu.pk

What is Statistics? (A science or an art?)

Statistics is the activity of obtaining data and then compiling, summarizing, presenting, analyzing and interpreting it, and finally drawing conclusions.
In short:
Data → Process → Information/Conclusions
Statistics is a mixture of science and art: up to the processing stage it is a SCIENCE, while drawing conclusions is an individual's ART.

What is DATA? (A word or a keyword?)

DATA is a group of raw facts and figures which may VARY from
person to person, object to object, distance to distance and time to time.
Only the absence of VARIATION can produce a CONSTANT, and it does not exist in our physical world. Only spiritualism can define a CONSTANT.

Data v/s Variable


A variable is the storage of data; it is represented by letters such as X, Y, Z.
There are two types of variables:
Qualitative Variable: It deals with data that vary by kind. It provides labels, or names, for categories of like items, i.e. a set of observations where any single observation is a word or code that represents a class or category.
Gender, Complexion, Weather and Type are some examples of qualitative variables.
Quantitative Variable: It deals with numeric data, which measure either how much or how many of something, i.e. a set of observations where any single observation is a number that represents an amount or a count.
Age, Height, Number and Price are some examples of quantitative variables.
Source: http://www.microbiologybytes.com/maths/1011-17.html

Inactivity breaker
Objective: Take a blank page from your writing material and divide it into two columns in the following manner:

Qualitative Variables | Quantitative Variables
1- Gender | 1- Age
2- Complexion | 2- Height
3- Qualification | 3- Weight
4- Weather | 4- Price
... | ...
20. | 20.

Try to write at least 20 variables in each column by observing several fields such as management, agriculture, medicine, engineering and geology. Write your full name at the top of the sheet and submit it.

Data Sources
There are three major sources of data:
1. Survey/Census: An official, usually periodic enumeration of a population, often including the collection of related demographic information, is called a census. To survey means to inspect and determine the conditions of interest.
2. Experiment: Any activity which is usually conducted in an isolated environment and produces results is called an experiment.
3. Simulation: An artificial way of collecting data.

Data Collection/Compilation
Teaching Ranks, where 1 = Very Poor and 5 = Excellent (n = 37):
4.5, 3.7, 4.3, 3.3, 2.7, 4.7, 3.8, 4.5, 3.4, 4.0,
3.8, 2.7, 4.3, 3.4, 3.2, 3.7, 3.9, 3.8, 3.8, 3.7,
3.6, 5.0, 4.2, 4.1, 4.2, 4.1, 3.9, 4.5, 5.0, 3.7,
4.8, 3.2, 4.2, 4.5, 4.2, 5.0, 2.9
Data collection/compilation is needed to observe the actual behavior of the variable.
Note: The above data is a simulated version of the actual data.

Data Tabulation (Grouping Exercise)


Step # 01: Finding the range
Range = Max - Min = 5.0 - 2.7 = 2.3

Step # 02: Finding the number of classes
No. of classes = 1 + 3.3 log10(n) = 1 + 3.3 log10(37) = 6.175

Step # 03: Finding the class width (h)
h = Range / No. of classes = 2.3 / 6.175 = 0.372 ≈ 0.4

Class Interval:
One of the intervals into which the range of a variable of a distribution is divided, especially one of the divisions of the base line of a bar chart or histogram.
After forming the class intervals and their frequencies using tally marks, we can observe the actual behavior of the data.
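The three grouping steps can be sketched in Python as follows. This is only an illustration: the names `ranks` and `frequency_table` are hypothetical, and the class boundaries it produces (starting at the minimum, lower limit included, upper limit excluded) are an assumption that need not match the bins used for the histogram in the slides.

```python
import math

# Teaching ranks from the Data Collection slide (n = 37).
ranks = [4.5, 3.7, 4.3, 3.3, 2.7, 4.7, 3.8, 4.5, 3.4, 4.0,
         3.8, 2.7, 4.3, 3.4, 3.2, 3.7, 3.9, 3.8, 3.8, 3.7,
         3.6, 5.0, 4.2, 4.1, 4.2, 4.1, 3.9, 4.5, 5.0, 3.7,
         4.8, 3.2, 4.2, 4.5, 4.2, 5.0, 2.9]

n = len(ranks)
data_range = max(ranks) - min(ranks)          # Step 1: range
num_classes = 1 + 3.3 * math.log10(n)         # Step 2: number of classes
width = round(data_range / num_classes, 1)    # Step 3: class width, rounded


def frequency_table(data, width):
    """Count observations per class; works in tenths to avoid float edge cases."""
    tenths = [round(x * 10) for x in data]
    step = round(width * 10)
    start = min(tenths)
    k = (max(tenths) - start) // step + 1
    counts = [0] * k
    for t in tenths:
        counts[(t - start) // step] += 1
    return [((start + i * step) / 10, (start + (i + 1) * step) / 10, c)
            for i, c in enumerate(counts)]


table = frequency_table(ranks, width)
for lower, upper, count in table:
    print(f"{lower:.1f} to under {upper:.1f}: {count}")
```

Working in tenths is a design choice for this data, which is recorded to one decimal place; it keeps values such as 3.1 from falling into the wrong class because of floating-point rounding.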

Data → Process → Information

[Figure: frequency distribution table and histogram of the Teaching Ranks. x-axis: Ranks (class boundaries 2.7, 3.1, 3.4, 3.8, 4.2, 4.6); y-axis: Frequency (0 to 12).]

The frequency distribution table and the histogram above reveal the shape of the opinions held by the students. If we find a mathematical model that fits this shape, it is called a probability distribution.


Grouping the data (MS Excel)

The Data Analysis option is located in the Data menu. If it is not present there, we can activate it through the Add-Ins section of Excel Options.

Grouping the data (MS Excel), contd.

After providing the data range and selecting the Labels and Chart Output options, we can find the histogram either in a new worksheet or in a specified place in the existing sheet.
Bin numbers: These numbers represent the intervals that you want the Histogram tool to use for measuring the input data in the data analysis.

Statistical Measures (An introduction)


The phrase "descriptive statistics" is used generically in place of "statistical measures".
These statistics describe or summarize the qualities of data.
Another name is summary statistics, which we mostly use to support our reports, cases and research.
They are beneficial when a graphical summary is not sufficient for the final conclusions.

Data → Processing (by graph or by measure) → Conclusions

Statistical Measures (An Example)


Consider the following grouped data:

Class Intervals | Frequency | Relative Frequency (R.F.) | Cumulative Relative Frequency (C.R.F.)
2-4 | 2 | 2/25 = 0.08 | 0.08
4-6 | 5 | 5/25 = 0.20 | 0.28
6-8 | 9 | 9/25 = 0.36 | 0.64
8-10 | 7 | 7/25 = 0.28 | 0.92
10-12 | 2 | 2/25 = 0.08 | 1.00
Total | Σf = 25 | ΣR.F. = 1 |

The above data show the income, in thousands of rupees, of some individuals in the late 1980s.

Statistical Measures (Quartiles)


These are three values, denoted Q1, Q2 and Q3, which divide the data into 4 equal parts.
Each part contains 25% of the observations.
Quartiles usually highlight 4 different classes, i.e. Lower Class, Lower Middle, Upper Middle and Upper Class:

Min | 25% Lower Class | Q1 | 25% Lower Middle | Q2 | 25% Upper Middle | Q3 | 25% Upper Class | Max

Computing Quartiles
In order to compute the quartile values, we need the same frequency distribution with an additional column for the Cumulative Frequency.

Class Intervals | Frequency | Cumulative Frequency (C.F.)
2-4 | 2 | 2
4-6 | 5 | 7
6-8 | 9 | 16
8-10 | 7 | 23
10-12 | 2 | 25
Total | Σf = 25 |

Computing Quartiles (Procedure)


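For any grouped data, quartiles can be computed in two simple steps:
Step-1: Finding the location of the ith quartile (where i = 1, 2, 3):
Location of Qi = (i × n) / 4, where n = Σf
Step-2: Finding the value of the ith quartile:
Qi = l + (h / f) × [(i × n) / 4 - C.F.]
where l = lower limit of the class containing that location, h = class width, f = frequency of that class, and C.F. = cumulative frequency of the preceding class.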

Computing Quartiles (Demo)


Class Intervals | Frequency | Cumulative Frequency (C.F.)
2-4 | 2 | 2
4-6 (1st quartile class) | 5 | 7
6-8 | 9 | 16
8-10 | 7 | 23
10-12 | 2 | 25
Total | Σf = 25 |

Step-1 (for Q1): (1 × 25) / 4 = 6.25, which falls in the class 4-6
Step-2: Q1 = 4 + (2/5) × (6.25 - 2) = 5.7
Note: Class width = h = 2
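The two-step procedure can be sketched in Python as below. The function name `grouped_quartile` and the `(lower, upper, frequency)` tuple layout are illustrative assumptions, not part of the original slides.

```python
def grouped_quartile(classes, i):
    """Return the i-th quartile (i = 1, 2, 3) of grouped data.

    classes: list of (lower_limit, upper_limit, frequency), in ascending order.
    """
    n = sum(f for _, _, f in classes)
    location = i * n / 4                      # Step 1: location of Qi
    cum = 0                                   # C.F. of the preceding class
    for lower, upper, f in classes:
        if cum + f >= location:               # quartile class found
            h = upper - lower
            return lower + (h / f) * (location - cum)   # Step 2
        cum += f
    raise ValueError("location lies outside the data")


# Income data (in 1000s of rupees) from the slides.
income = [(2, 4, 2), (4, 6, 5), (6, 8, 9), (8, 10, 7), (10, 12, 2)]
q1, q2, q3 = (grouped_quartile(income, i) for i in (1, 2, 3))
print(round(q1, 3), round(q2, 3), round(q3, 3))
```

For the income data this reproduces the values used on the following slide (Q1 = 5.7, Q2 ≈ 7.222, Q3 ≈ 8.786, in thousands of rupees).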

Quartiles (Income Classes)


Min | 25% Lower Class | Q1 | 25% Lower Middle | Q2 | 25% Upper Middle | Q3 | 25% Upper Class | Max
2000 | | 5700 | | 7222 | | 8786 | | 12000

Quartiles can also be computed in MS Excel; the ungrouped form of the data is needed there. The syntax is:
=QUARTILE(Data Range, i), where i = 1, 2, 3 is the quartile number.

Exploratory Data Analysis (EDA) by John Wilder Tukey
There are two types of studies:
Hypothetical Study
Exploratory Study
In an exploratory study, we can perform our analysis without following conventional methodologies. In EDA, we observe the trend of the data by applying different processes to it.
The box-plot is a very useful part of EDA.

The Box-Plot
[Figure: Boxplot of Teaching Ranks, marking Min, Q1, Q2, Q3 and Max on an axis running from 3 to 5.]
Inter-quartile Range = Q3 - Q1

Exploratory Analysis for Quality Ranks from Aventis Field Managers
[Figure: Boxplots of Teaching, Administration and Structure (means are indicated by solid circles).]

Statistical Measures (Central Tendency)

(Mean, Median and Mode)


Mean: The main problem with the mean of a data set is that it is sensitive to outliers.
Median: The median is simply the middle value among the ordered scores of a variable. It is the 2nd quartile (Q2) of any data.
Mode: The most frequent response or value of a variable. Multiple modes are possible: bimodal or multimodal.

Mean, Median and Mode


Measurements are on the x-axis and frequencies are on the y-axis.

The mode is based on the principle of democracy, while the median (Q2) follows the rule of moderation. The mean takes its place after being influenced by the higher values of the measurements. The distribution shown above is positively skewed.

Mean and Mode (Computations)


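Class Intervals | Frequency fi | Mid-Points xi | fi xi
2-4 | 2 | (2+4)/2 = 3 | 2 × 3
4-6 | f1 = 5 | (4+6)/2 = 5 | 5 × 5
6-8 (modal class) | fm = 9 | (6+8)/2 = 7 | 9 × 7
8-10 | f2 = 7 | (8+10)/2 = 9 | 7 × 9
10-12 | 2 | (10+12)/2 = 11 | 2 × 11
Total | Σfi = 25 | | Σfi xi = 179

Mode = l + [(fm - f1) / (2fm - f1 - f2)] × h = 6 + [(9 - 5) / (18 - 5 - 7)] × 2 = 7.333
Mean = Σfi xi / Σfi = 179 / 25 = 7.160

The majority's income is Rs. 7333/-
The average income of the community is Rs. 7160/-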
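As a rough check of the grouped mean and mode formulas, the following Python sketch recomputes both for the income data. The helper names `grouped_mean` and `grouped_mode` are hypothetical, and the mode function assumes a single modal class of the same width as its neighbours.

```python
def grouped_mean(classes):
    """Mean of grouped data: sum(f * midpoint) / sum(f)."""
    total = sum(f for _, _, f in classes)
    return sum(f * (lower + upper) / 2 for lower, upper, f in classes) / total


def grouped_mode(classes):
    """Mode of grouped data: l + (fm - f1) / (2*fm - f1 - f2) * h."""
    m = max(range(len(classes)), key=lambda k: classes[k][2])   # modal class index
    lower, upper, fm = classes[m]
    f1 = classes[m - 1][2] if m > 0 else 0                      # preceding frequency
    f2 = classes[m + 1][2] if m < len(classes) - 1 else 0       # following frequency
    return lower + (fm - f1) / (2 * fm - f1 - f2) * (upper - lower)


income = [(2, 4, 2), (4, 6, 5), (6, 8, 9), (8, 10, 7), (10, 12, 2)]
mean_income = grouped_mean(income)
mode_income = grouped_mode(income)
print(round(mean_income, 3), round(mode_income, 3))
```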

Empirical relationship b/w Mean, Median and Mode
Following are the values of the Mean, Median and Mode obtained from the income data:
Mean = Σfi xi / Σfi = 179 / 25 = 7.160
Median = Q2 = 7.222
Mode = l + [(fm - f1) / (2fm - f1 - f2)] × h = 7.333

Mean < Median < Mode (thus the data is slightly negatively skewed)

Mean, Median and Mode


The MS Excel syntaxes for the three measures of central tendency are:
=AVERAGE(Data Range) for the Mean
=QUARTILE(Data Range, 2) for the Median
=MODE(Data Range) for the Mode

Statistical Measures (Dispersion)


What is DISPERSION?
A dart game can help us understand it.
Based on visual observation, we can declare Player A the winner because:
Player A is more consistent / less variable / homogeneous / less dispersed,
and
Player B is less consistent / more variable / heterogeneous / more dispersed.

Measures of Dispersion
Some Important Measures of Dispersion are:
Range=Max-Min
Variance
Standard Deviation
Mean Deviation
Inter-quartile Range
Coefficient of Variation (C.V.)

Dispersion Measures (Contd.)

Variance = V(X) = Σ(xi - x̄)² / n

Variance of the following ungrouped data:
X: 1, 2, 3, 4, 5
Mean = 3
V(X) = [(1-3)² + (2-3)² + (3-3)² + (4-3)² + (5-3)²] / 5 = 2
Standard Deviation = √V(X) = √2 = 1.414

Coefficient of Variation (Consistency Check)


In order to check whether a variable is consistent or not, we need to compute the coefficient of variation:

C.V. = [√V(X) / x̄] × 100 = (Standard Deviation / Mean) × 100

For any consistent variable, C.V. < 100%.
C.V. is a unit-less measure of dispersion.

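Variance & Standard deviation (grouped data)

Class Intervals | Frequency fi | Mid-Points xi | fi xi | fi (xi - mean)²
2-4 | 2 | (2+4)/2 = 3 | 2 × 3 | 2(3 - 7.16)² = 34.61
4-6 | 5 | (4+6)/2 = 5 | 5 × 5 | 5(5 - 7.16)² = 23.33
6-8 | 9 | (6+8)/2 = 7 | 9 × 7 | 9(7 - 7.16)² = 0.23
8-10 | 7 | (8+10)/2 = 9 | 7 × 9 | 7(9 - 7.16)² = 23.70
10-12 | 2 | (10+12)/2 = 11 | 2 × 11 | 2(11 - 7.16)² = 29.49
Total | Σfi = 25 | | Σfi xi = 179 | Σfi (xi - x̄)² = 111.36

Variance = V(X) = Σfi (xi - x̄)² / Σfi = 111.36 / 25 = 4.45
Standard Deviation = √V(X) = √4.45 ≈ 2.11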

Variable Comparison (Property of C.V.)


The coefficient of variation for 1, 2, 3, 4, 5 is:
C.V. = [√V(X) / x̄] × 100 = (1.414 / 3) × 100 = 47.1%
And for the income data it is:
C.V. = [√V(X) / x̄] × 100 = (2.111 / 7.16) × 100 = 29.48%
So technically, the income data is more consistent than the first five natural numbers.
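The variance, standard deviation and C.V. comparison can be reproduced with a short Python sketch. The function names are illustrative only; the variance uses the divide-by-n (population) form shown in the slides, with optional frequencies for grouped data.

```python
import math


def mean_and_variance(values, freqs=None):
    """Population mean and variance; freqs are class frequencies for grouped data."""
    if freqs is None:
        freqs = [1] * len(values)
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    variance = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs)) / n
    return mean, variance


def coefficient_of_variation(mean, variance):
    """C.V. in percent: standard deviation divided by mean, times 100."""
    return math.sqrt(variance) / mean * 100


# Ungrouped data: the first five natural numbers.
m1, v1 = mean_and_variance([1, 2, 3, 4, 5])
cv1 = coefficient_of_variation(m1, v1)

# Grouped income data: class mid-points with their frequencies.
m2, v2 = mean_and_variance([3, 5, 7, 9, 11], [2, 5, 9, 7, 2])
cv2 = coefficient_of_variation(m2, v2)

print(round(v1, 3), round(cv1, 1))
print(round(v2, 3), round(cv2, 2))
```

The lower C.V. of the income data is what supports the conclusion that it is the more consistent of the two variables.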

Hand-Profile Analysis
(An exploratory approach)
[Figure: outline of a hand marking the measurements X1 (Thumb), X2, X3, X4, X5, X6 (Span) and X7 (Length), all in cms.]

S.No. | X1 | X2 | X3 | X4 | X5 | X6 | X7

Determine the Mean, Standard Deviation and Coefficient of Variation.

Moments Based Skewness


Moments-based skewness measure: β1 = μ3² / μ2³
This always gives a positive or a zero value.
For a symmetric distribution, this measure is zero because μ3 = 0.
For any skewed distribution, β1 has a positive value and the sign of μ3 indicates the direction of skewness: for a negatively skewed distribution μ3 is negative, and for a positively skewed distribution μ3 is positive.

Moments Based Kurtosis


What is Kurtosis?
Moments-based kurtosis measure: β2 = μ4 / μ2²

For the Leptokurtic state (less dispersed): β2 > 3
For the Mesokurtic state (normally dispersed): β2 = 3
For the Platykurtic state (more dispersed): β2 < 3
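A small Python sketch of the moment-based measures follows. It assumes the usual definitions β1 = μ3²/μ2³ and β2 = μ4/μ2², where μr is the r-th central moment computed with divisor n; the function names and the two small data sets are illustrative.

```python
def central_moment(data, r):
    """r-th central moment with divisor n."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** r for x in data) / n


def beta1(data):
    """Moment-based skewness measure; never negative."""
    return central_moment(data, 3) ** 2 / central_moment(data, 2) ** 3


def beta2(data):
    """Moment-based kurtosis measure; compare against 3."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2


symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 1, 2, 10]

print(beta1(symmetric), beta2(symmetric))
print(central_moment(right_skewed, 3) > 0)   # sign of mu3 gives the direction
```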

Approximate Confidence Interval


For any bell-shaped symmetrical distribution, the following hold:
1) μ ± σ covers approximately 68% of the observations
2) μ ± 2σ covers approximately 95% of the observations
3) μ ± 3σ covers approximately 99.7% of the observations
where μ and σ are the mean and standard deviation respectively.
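For the normal distribution, which is the standard bell-shaped model, the three coverage figures can be checked with Python's standard library. This is a sketch that assumes normality; real data will only match approximately.

```python
from statistics import NormalDist


def coverage(k):
    """Probability that a normal observation lies within k standard deviations of the mean."""
    z = NormalDist()            # standard normal: mean 0, standard deviation 1
    return z.cdf(k) - z.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {coverage(k) * 100:.2f}%")
```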

Why Bell-Shaped Symmetrical Distribution?
In a bell-shaped distribution, extreme values occur with lower frequency.
The majority falls within one standard deviation of the mean.
It is Nature's distribution. God created almost all natural measures with a bell-shaped distribution.

Empirical Proof for the Approx. Confidence Intervals
Bring one Neem leaf and measure its length in cms.
Obtain the Mean and Standard Deviation.
Empirically verify the following statements:
1) μ ± σ covers approximately 68% of the observations
2) μ ± 2σ covers approximately 95% of the observations
3) μ ± 3σ covers approximately 99.7% of the observations

H.W. (Group the data and show that it is bell-shaped and symmetric in nature.)

Statistical Process Control (SPC)


The concept is based on approximate confidence intervals.
It is usually used to monitor a manufacturing process or to observe an individual's performance.
For this purpose, we set up a graph called a Control Chart.
A Control Chart is bounded by two Control Limits.

A Control Chart
[Figure: control chart showing a realization plotted around the theoretical/claimed value μ, bounded by the Upper Control Limit at μ + 3σ and the Lower Control Limit at μ - 3σ.]

By observing a realization we can monitor a process; the chart alerts us under two conditions:
1- An observation crosses, or even touches, a pre-alarm control limit; or
2- The motion of the realization becomes rhythmic.

Statistical Process Control (An activity)


Consider the following manufacturing process:
X = 2 × Ran#
Simulate 7 observations using this simulator.
Obtain a control chart using these parameter values: μ = 1, σ = 0.3.
Deduce whether your process is under control or not. Comment on your realization.

S.No. | X = 2 × Ran#
1 | 2 × Ran#
2 | 2 × Ran#
3 | 2 × Ran#
4 | 2 × Ran#
5 | 2 × Ran#
6 | 2 × Ran#
7 | 2 × Ran#
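The activity can be mimicked in Python as sketched below. Here `random.random()` stands in for the calculator's Ran# key, the seed is arbitrary, and the function names are hypothetical; the control limits are taken as μ ± 3σ with the stated μ = 1 and σ = 0.3.

```python
import random


def control_limits(mu, sigma, k=3):
    """Lower and upper control limits at mu -/+ k*sigma."""
    return mu - k * sigma, mu + k * sigma


def simulate(n, rng):
    """Simulate n observations of the process X = 2 x Ran#."""
    return [2 * rng.random() for _ in range(n)]


def out_of_control(observations, lcl, ucl):
    """Return (serial number, value) for points that touch or cross a limit."""
    return [(i + 1, x) for i, x in enumerate(observations)
            if x <= lcl or x >= ucl]


rng = random.Random(2024)                 # arbitrary seed, for repeatability only
lcl, ucl = control_limits(mu=1.0, sigma=0.3)
observations = simulate(7, rng)

for serial, x in enumerate(observations, start=1):
    print(serial, round(x, 3))
print("limits:", round(lcl, 2), round(ucl, 2))
print("alerts:", out_of_control(observations, lcl, ucl))
```

The second alert condition (a rhythmic pattern in the realization) is a visual judgment and is not covered by this sketch.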

A Class Activity
Write your name at the top of today's class work.
Keep your class work open on your desk.
Leave your seat, check the copy of at least one of your classmates and write your remarks about him/her on a chit.
Submit your remarks chit to me after writing the name of that classmate on it.

Introduction to Probability
It is the science in which either we study a
random experiment or we observe a random
phenomenon.
In probability study, a sample space is needed
which is the set of all possible outcomes of
any random experiment.
It is the link b/w Descriptive and Inferential Statistics.

Logical Thinking motivation


Drawing a FISH can help us understand logical thinking:

Now, try to re-draw the same fish, but without lifting your pen once it touches the paper and without retracing any line you have already drawn.

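Logical Thinking through the Venn diagram
A Venn diagram is a rectangular area representing the Sample Space, with some circles inside (usually overlapping) representing the Events.
S = {a, b, c, d, ..., n}
A = {a, b, c, f, g, h}
B = {c, d, e, g, h, i}
C = {f, g, h, i, j, k}
[Figure: Venn diagram of S with overlapping circles A, B and C. Regions: a, b in A only; d, e in B only; j, k in C only; c in A and B; f in A and C; i in B and C; g, h in all three; l, m, n outside the circles.]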

Shading the Venn Diagram

[Figure: Venn diagrams of events A and B inside S, shaded in turn for different combinations of A and B, ending with De Morgan's Law.]

Probability Topics Tree

[Diagram: topic tree starting from Random Experiment, with branches for Counting Rules, Sample Space, Outcomes and Events (Mutually Exclusive, i.e. non-overlapping; Non-Mutually Exclusive, i.e. overlapping); Probability (Independent: P(A∩B) = P(A) × P(B); Dependent: Conditional Probability); and Criteria, Numeric, Random Variable, Probability Distribution and Expectation.]

What is the Distribution?


It gives us a picture of the variability and central tendency.
It can also show the amount of skewness and kurtosis.

[Figure: Bell-Shaped Symmetrical Distribution, annotated with Central Tendency and Dispersion.]

Probability Distributions
For any frequency distribution we need a variable, while for any probability distribution we need a random variable.
A random variable is the data obtained by converting the outcomes of a sample space into numeric codes after defining a particular criterion; so
a random experiment is necessary for a probability distribution.

Any experiment with uncertain results (outcomes) is called a random experiment.
For example, mixing an acid and a base will produce salt and water (it is an experiment, but not a random one).
Tossing a dice or a coin, or drawing a card from a well-shuffled deck, produces a random result (these are examples of random experiments). In each random experiment we collect all the possibilities (outcomes) to form a sample space.

Formation of Sample Spaces


Random Experiments Related to a Fair Coin:
Random Experiment # 1: Tossing a fair coin once
S = {H, T}: 2^1 = 2 outcomes
Random Experiment # 2: Tossing a fair coin twice, or tossing 2 fair coins once
S = {HH, HT, TH, TT}: 2^2 = 4 outcomes
Random Experiment # 3: Tossing a fair coin thrice, or tossing 3 fair coins once
S = {HHH, HHT, HTH, THH, THT, TTH, HTT, TTT}: 2^3 = 8 outcomes
In general, 2^n is the number of outcomes when the two-sided coin is tossed n times.

Formation of Dichotomous Sample Spaces
A truth table can help us form the sample space, e.g. the sample space of Random Experiment # 3.
The formation rule is simple: the run of repeated values in every next column should be double that of the preceding column. Outcomes are read horizontally.

S. No. | 1st | 2nd | 3rd
1 | H | H | H
2 | T | H | H
3 | H | T | H
4 | T | T | H
5 | H | H | T
6 | T | H | T
7 | H | T | T
8 | T | T | T

Random Experiments with Dice


Random Experiment #4: Tossing a fair dice once
S = {1, 2, 3, 4, 5, 6}: 6^1 = 6 outcomes
Random Experiment #5: Tossing a fair dice twice, or tossing two fair dice once
S = {11, 12, 13, 14, 15, 16,
21, 22, 23, ..., 26,
...
61, 62, 63, ..., 66}: 6^2 = 36 outcomes

Random Experiments (Contd.)

Random Experiment #6: Tossing a fair coin and a fair dice once
S = {H1, H2, H3, H4, H5, H6, T1, T2, ..., T6}: 2^1 × 6^1 = 12 outcomes
Random Experiment #7: Tossing 2 fair coins and a fair dice once
S = {HH1, HH2, HH3, HH4, HH5, HH6,
HT1, HT2, HT3, ..., HT6,
...
TT1, TT2, ..., TT6}: 2^2 × 6^1 = 24 outcomes
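These sample spaces can be enumerated mechanically with a Cartesian product. The sketch below uses Python's `itertools.product`; the helper `sample_space` and the variable names are illustrative, and the listing order may differ from the order used in the slides.

```python
from itertools import product

COIN = "HT"          # two faces of a coin
DICE = "123456"      # six faces of a dice


def sample_space(*devices):
    """All outcomes of tossing the given devices once each, as strings."""
    return ["".join(outcome) for outcome in product(*devices)]


s2 = sample_space(COIN, COIN)          # Random Experiment # 2
s3 = sample_space(COIN, COIN, COIN)    # Random Experiment # 3
s5 = sample_space(DICE, DICE)          # Random Experiment # 5
s6 = sample_space(COIN, DICE)          # Random Experiment # 6
s7 = sample_space(COIN, COIN, DICE)    # Random Experiment # 7

print(s2)
print(len(s3), len(s5), len(s6), len(s7))
```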

Random Experiments A Deck of Cards


Random Experiment #8: Drawing a card from a well-shuffled deck of playing cards.
S = {
Hearts: King, Queen, Jack, Ace, 2, 3, ..., 10 (13 cards)
Diamonds: King, Queen, Jack, Ace, 2, 3, ..., 10 (13 cards)
Clubs: King, Queen, Jack, Ace, 2, 3, ..., 10 (13 cards)
Spades: King, Queen, Jack, Ace, 2, 3, ..., 10 (13 cards)
}
Total = 52 cards

Formation of Events
What is an Event?
It is a logical statement which should be followed strictly.
We always collect the matching outcomes from the sample space after reading the event statement.
For example, consider Random Exp. # 2:
Object: Tossing a fair coin twice, S = {HH, HT, TH, TT}
Event(s):
A = {First toss should be a Head}: A = {HH, HT}
B = {Exactly one Tail in the outcome}: B = {HT, TH}
Thus we have formed two non-mutually exclusive events.
[Venn diagram: HH in A only, HT in both A and B, TH in B only, TT outside both.]
Replicate the same work for Random Experiment #3.

Computing Probability
Probability of an Event
P(A) stands for the probability of an event A, such that
P(A) = n(A) / n(S)
where
n(A) is the number of outcomes present in event A, and
n(S) is the number of outcomes present in the sample space.
Probability is the proportion of an event in a sample space.
For any event A: 0 ≤ P(A) ≤ 1, where A ⊆ S.

Computing Probabilities (Example)


Random Experiment # 2: Tossing a fair coin twice or
tossing two fair coins, once.
Sample Space
S={HH,HT,TH,TT},
Event(s)
A = {First toss should be a Head}: A = {HH, HT}
B = {Exactly one Tail in the outcome}: B = {HT, TH}
Therefore the probabilities will be:
P(A) = 2/4 = 0.5 (50% chance)
P(B) = 2/4 = 0.5 (50% chance)

Interpreting Probability
A probability is attached to every event and should be interpreted using 3 components:
1) Object of the Random Experiment
2) Value of the Probability
3) Event Statement

For example, the interpretation of P(A) = 0.5 can be written as:
If we toss a fair coin twice, we have a 50% chance of getting a head in the first toss.
Similarly, P(B) = 0.5 would be:
If we toss a fair coin twice, we have a 50% chance of getting exactly one tail in both tosses.

Union, Intersection and Complement

For the same Random Experiment # 2, the following operations show the results and the relevant interpretations (where ∪ = OR, ∩ = AND, A′ = not A):
Since
S = {HH, HT, TH, TT}
A = {HH, HT}, B = {HT, TH}
Therefore,
A∪B = {HH, HT, TH}
P(A∪B) = 3/4 = 0.75 (75%)
If we toss a fair coin twice, we have a 75% chance of getting a head in the first toss OR exactly one tail in both tosses.
A∩B = {HT}
P(A∩B) = 1/4 = 0.25 (25%)
A′ = S - A = {TH, TT}
P(A′) = 2/4 = 0.50 (50%)
P(A′) = 1 - P(A)
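Because events are just subsets of the sample space, the union, intersection and complement above map directly onto Python set operations. This is a sketch; the helper `prob` is a hypothetical name and assumes equally likely outcomes.

```python
from fractions import Fraction

# Random Experiment # 2: tossing a fair coin twice.
S = {"HH", "HT", "TH", "TT"}
A = {o for o in S if o[0] == "H"}          # first toss is a Head
B = {o for o in S if o.count("T") == 1}    # exactly one Tail


def prob(event, space=S):
    """P(event) = n(event) / n(S), assuming equally likely outcomes."""
    return Fraction(len(event & space), len(space))


print(prob(A | B))    # union: A OR B
print(prob(A & B))    # intersection: A AND B
print(prob(S - A))    # complement: not A
```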

Practice Questions
Q1) If we toss a fair coin three times, determine the following probabilities:
a) P(A) = Probability of getting exactly one Head in all tosses?
b) P(B) = Probability of getting a Tail in the first toss?
c) P(C) = Probability of getting exactly one Head AND one Tail? P(One Head ∩ One Tail)
d) P(D) = Probability of NOT getting exactly one Head in all tosses? P(A′)
e) P(F) = Probability of either getting exactly one Head in all tosses OR a Tail in the first toss? P(A∪B)

Practice Questions (Contd..)


Q2) If we toss a fair dice twice, determine the following probabilities: (Ref. Random Experiment #5)
a) P(A) = Probability of getting the same number on both dice?
b) P(B) = Probability of getting an odd number on both dice?
c) P(C) = Probability of getting a sum of both numbers equal to 5?
d) P(D) = Probability of getting an odd number AND an even number on the two dice respectively?
e) P(F) = Probability of NOT getting the same number on both dice?

Practice Questions (Contd..)


Q3) If we toss a fair COIN and a fair DICE once, determine the following probabilities: (Ref. Random Experiment #6)
a) P(A) = Probability of getting exactly one Head on the coin?
b) P(B) = Probability of getting an odd number on the dice?
c) P(C) = Probability of getting exactly one Head with an odd number on the dice? P(A∩B)
d) P(D) = Probability of getting a number less than 4 on the dice?
e) P(F) = Probability of NOT getting exactly one Head on the coin? P(A′) = 1 - P(A)

Practice Questions (Contd..)


Q4) If we toss two fair COINS and a fair DICE once, determine the following probabilities: (Ref. Random Experiment #7)
a) P(A) = Probability of getting exactly one Head on the coins?
b) P(B) = Probability of getting an odd number on the dice?
c) P(C) = Probability of getting exactly one Head with an odd number on the dice? P(A∩B)
d) P(D) = Probability of getting a number less than 4 on the dice?
e) P(F) = Probability of NOT getting exactly one Head on the coins? P(A′) = 1 - P(A)

A VENN diagram case with a Deck


If we draw one card from a well-shuffled deck, determine the following events:
K = {it will be a King}: K = {4 cards}
A = {it will be an Ace}: A = {4 cards}
B = {it will be a Black Card}: B = {26 cards}
D = {it will be a Diamond}: D = {13 cards}
E = {it will be a card numbered from 3 to 5}: E = {12 cards}
Show a Venn diagram containing these 5 events.

[Figures: the Venn diagram of the deck built up step by step, showing the events B, D, K, A and E with the number of cards in each region.]

Deck of playing card, an example


P(B∩E) = 6/52 = 0.115
Interpretation: When we draw a card from a well-shuffled deck, we have an 11.5% chance of getting a Black card which is numbered from 3 to 5.
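The count behind P(B∩E) can be confirmed by building the deck explicitly. In this sketch the rank and suit labels are illustrative choices, and Clubs and Spades are taken as the black suits.

```python
from fractions import Fraction
from itertools import product

SUITS = ["Hearts", "Diamonds", "Clubs", "Spades"]
RANKS = ["Ace"] + [str(n) for n in range(2, 11)] + ["Jack", "Queen", "King"]

deck = list(product(RANKS, SUITS))                       # 52 (rank, suit) pairs

B = {card for card in deck if card[1] in ("Clubs", "Spades")}   # black cards
E = {card for card in deck if card[0] in ("3", "4", "5")}       # numbered 3 to 5

p_b_and_e = Fraction(len(B & E), len(deck))
print(len(B), len(E), len(B & E), round(float(p_b_and_e), 3))
```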

Independence/Dependence
Conditional Probability
A contingency table can help us understand the concept of independent or dependent events.
A contingency table is a bivariate frequency table showing the joint distribution of two variables.
Usually two qualitative variables are used to form a contingency table.

Contingency Table (An Example)


Consider the following table, which represents Gender (Male/Female) and Eyesight Status (Glasses/No Glasses):

Gender/Eyesight | Male (M) | Female (F) | Row Total
Glasses (G) | 05 (M∩G) | 12 (F∩G) | G = 17
No Glasses (NG) | 09 (M∩NG) | 19 (F∩NG) | NG = 28
Column Total | M = 14 | F = 31 | S = 45

Conditional Probability (Example Exercise)


Q: If we select a person at random from this community, determine the probability that the selected person will be:
a) A Male?
Ans. P(M) = 14/45 = 0.31 (31%)
b) A Male with Glasses?
Ans. P(M∩G) = 05/45 = 0.11 (11%)
c) A Male, given that he must be wearing Glasses?
Ans. P(M/G) = 05/17 = 0.294 (29.4%)
P(M/G) = P(M∩G) / P(G) = (05/45) / (17/45) = 0.294

Conditional Probability
(Independence Check)
If Gender is independent of Eyesight, then the following should hold:
P(M/G) ≈ P(M)
We checked this empirically and got this result:
0.294 ≈ 0.31
Therefore we can say that Gender is independent of Eyesight, which is quite obvious.
We might get different values for the simple and the conditional probabilities if we consider the case of Gender v/s Heart Disease.

Conditional Probability
(Independence Check)
We might get a different result if we make a minor change in the table (the female counts 12 and 19 are swapped, so the row totals change from 17 and 28 to 24 and 21):

Gender/Eyesight | Male (M) | Female (F) | Row Total
Glasses (G) | 05 | 19 | G = 24
No Glasses (NG) | 09 | 12 | NG = 21
Column Total | M = 14 | F = 31 | S = 45

Now we can observe the dependence from the following result:
P(M/G) = 05/24 = 0.208 is not equal to P(M) = 14/45 = 0.31
Hence Gender and Eyesight are dependent.
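The independence check for both versions of the table can be sketched in Python as follows. The nested-dictionary layout and the function name `independence_check` are assumptions made for illustration; deciding how close P(M/G) must be to P(M) to count as "approximately equal" is left to judgment, as in the slides.

```python
from fractions import Fraction


def independence_check(table):
    """Return (P(M), P(M given G)) from a table of counts: table[eyesight][gender]."""
    total = sum(sum(row.values()) for row in table.values())
    p_m = Fraction(sum(row["M"] for row in table.values()), total)
    p_m_given_g = Fraction(table["G"]["M"], sum(table["G"].values()))
    return p_m, p_m_given_g


original = {"G": {"M": 5, "F": 12}, "NG": {"M": 9, "F": 19}}
modified = {"G": {"M": 5, "F": 19}, "NG": {"M": 9, "F": 12}}

for name, table in (("original", original), ("modified", modified)):
    p_m, p_m_given_g = independence_check(table)
    print(name, round(float(p_m), 3), round(float(p_m_given_g), 3))
```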

Conditional Probability
(Class Activity)
Generate the following bivariate data from memory:

S.No. | Gender (M/F) | Like/Dislike
1 | |
... | |
50 | |

Form a bivariate frequency table and test whether Gender is independent of your views or not, i.e.:
P(M/L) ≈ P(M) or P(M/L) ≠ P(M)
An independent result shows no gender discrimination in the MIND, and vice versa for dependence.

Random Variables (Example)


If we toss a fair coin twice (Random Exp. # 2), the sample space contains all the possibilities:

S = {HH, HT, TH, TT}
Now if we define the following criterion,
X = {number of heads in each outcome},
then X is a random variable taking these values:

X = {2, 1, 1, 0}
Finally we get the probability distribution, in which, for example, P(X=0) = 0.25 is the chance of getting no heads in both tosses, and so on.

A Probability Distribution
X | P(X=x)
0 | 1/4
1 | 2/4
2 | 1/4

Replicate the same for Random Exp. # 3.
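The step from sample space to probability distribution can be sketched in Python by applying the criterion "number of heads" to every outcome and counting. The function name `heads_distribution` is illustrative; calling it with 3 gives the distribution asked for in Random Exp. # 3.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def heads_distribution(n):
    """Probability distribution of X = number of heads in n tosses of a fair coin."""
    outcomes = ["".join(o) for o in product("HT", repeat=n)]
    counts = Counter(o.count("H") for o in outcomes)     # criterion applied to S
    return {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}


for x, p in heads_distribution(2).items():
    print(x, p)
```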

Game Theory (Gamblers Example)


Suppose a gambler offers to play a game with you by tossing a fair coin and a fair dice once.
He agrees to pay you Rs. 100/- for each Head that appears on the coin and Rs. 300/- for a number greater than 4 on the dice.
On the other hand, you will pay him Rs. 50/- for each Tail that appears on the coin and Rs. 250/- for a number less than 5 on the dice.
Determine whether it is an expected loss or an expected gain for you.

Gamblers Example: Working


It is Random Experiment # 6, in which we toss a fair coin and a fair dice once.
S = {H1, H2, H3, H4, H5, H6, T1, T2, T3, T4, T5, T6}
H = +100, T = -50, {1, 2, 3, 4} = -250, {5, 6} = +300
X = {-150, -150, -150, -150, +400, +400, -300, ..., +250}

X | P(X=x)
-300 | 4/12
-150 | 4/12
+250 | 2/12
+400 | 2/12

Mathematical Expectation E(X)


In order to get the total effect of the probabilities on the values of X, we need to form another column:

X | P(X=x) | X P(X=x) | X² P(X=x)
-300 | 0.333 | -300 × 0.333 | (-300)² × 0.333
-150 | 0.333 | -150 × 0.333 | (-150)² × 0.333
+250 | 0.167 | +250 × 0.167 | (+250)² × 0.167
+400 | 0.167 | +400 × 0.167 | (+400)² × 0.167
Total | | E(X) = | E(X²) =

E(X) is called the mean of the probability distribution. To find the variance, we need to form another column for E(X²).
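The expectation columns can be filled in with a short Python sketch. The `payoff` function encodes the gambler's rules as stated in the example; the variance line uses the standard identity V(X) = E(X²) - [E(X)]², which is assumed here rather than quoted from the slides. Exact fractions are used to avoid the rounding in 0.333 and 0.167.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def payoff(coin, dice):
    """Net amount (in rupees) for one play of the gambler's game."""
    coin_part = 100 if coin == "H" else -50
    dice_part = 300 if dice > 4 else -250
    return coin_part + dice_part


outcomes = list(product("HT", range(1, 7)))              # Random Experiment # 6
counts = Counter(payoff(c, d) for c, d in outcomes)
dist = {x: Fraction(k, len(outcomes)) for x, k in counts.items()}

e_x = sum(x * p for x, p in dist.items())                # E(X)
e_x2 = sum(x * x * p for x, p in dist.items())           # E(X^2)
variance = e_x2 - e_x ** 2                               # assumed identity

print(float(e_x), float(e_x2), float(variance))
print("expected loss" if e_x < 0 else "expected gain")
```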

Quiz # 2

Odd Rows
Suppose a gambler offers to play a game with you by tossing two fair coins and a fair dice once.
He agrees to pay you Rs. 100/- for each Head that appears on the coins and Rs. 300/- for a number greater than 4 on the dice.
On the other hand, you will pay him Rs. 50/- for each Tail that appears on the coins and Rs. 250/- for a number less than 5 on the dice.
Determine whether it is an expected loss or an expected gain for you.

Even Rows
Suppose a gambler offers to play a game with you by tossing two fair coins and a fair dice once.
He agrees to pay you Rs. 150/- for each Head that appears on the coins and Rs. 400/- for a number greater than 4 on the dice.
On the other hand, you will pay him Rs. 100/- for each Tail that appears on the coins and Rs. 250/- for a number less than 5 on the dice.
Determine whether it is an expected loss or an expected gain for you.

Combinatorial Problems
What are Counting Techniques?
Counting techniques are usually used:
when the sample space of a random experiment is too large to list;
when we want to explore the different ways to arrange or rearrange objects;
when we want to know how large the domain of possibilities is, even in assigning simple tasks to different individuals.

Counting Rules
In order to understand the concept, we can consider the following case:
We have 3 objects A, B, C and we want to choose 2 objects out of the 3.
Then we have 2 questions before we proceed:
Q1: Is duplication allowed? (Y/N)
Q2: Is order important (AB ≠ BA)? (Y/N)

Counting Rules (Power Principle)


If the answer to both questions is YES, i.e.
Q1: Is duplication allowed? (Y)
Q2: Is order important (AB ≠ BA)? (Y)
then the group of arrangements should be:
AA AB AC
BA BB BC
CA CB CC
The formula for the total number of WAYS is N^r, where N = 3 and r = 2; therefore 3^2 = 9 ways. It is known as the POWER RULE.

Counting Rules (Permutations)


If the answer sequence is as below:
Q1: Is duplication allowed? (N)
Q2: Is order important (AB ≠ BA)? (Y)
then the group of arrangements should be:
AB AC
BA BC
CA CB
The formula for the total number of WAYS is NPr = N! / (N-r)!, where N = 3 and r = 2; therefore 3P2 = 6 ways. It is known as PERMUTATIONS.

Counting Rules (Combinations)


If the answer sequence is as below:
Q1: Is duplication allowed? (N)
Q2: Is order important (AB ≠ BA)? (N)
then the group of arrangements should be:
AB AC
BC
The formula for the total number of WAYS is NCr = N! / [r! (N-r)!], where N = 3 and r = 2; therefore 3C2 = 3 ways. It is known as COMBINATIONS.
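The three counting rules correspond to three functions in Python's `itertools`, which makes it easy to list the arrangements for A, B, C and compare the counts with the formulas. This is a sketch; `math.perm` and `math.comb` need Python 3.8 or later.

```python
import math
from itertools import combinations, permutations, product

objects = "ABC"
r = 2

# Duplication allowed, order important: power rule, N^r.
power = ["".join(p) for p in product(objects, repeat=r)]

# No duplication, order important: permutations, NPr.
perms = ["".join(p) for p in permutations(objects, r)]

# No duplication, order not important: combinations, NCr.
combs = ["".join(c) for c in combinations(objects, r)]

print(len(power), power)
print(len(perms), perms)
print(len(combs), combs)
```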

Counting Rules (A Class Activity)


Once your CLASS TEACHER says START, you all have to change your seats.
Do it as quickly as you can.
Compute the TOTAL TIME required for all possible arrangements:
NPr × time in seconds = Total seconds
(Total seconds) / 60 = Total minutes
(Total minutes) / 60 = Total hours
(Total hours) / 24 = Total days
(Total days) / 365 = Total years required

Counting Rules (Cases)


Solve the following cases with a suitable counting rule:
1- How many ways are possible when we have to decide the batting order of a cricket team?
The answer is Permutations, because duplication is not allowed but order matters. Therefore:
10P10 = 10! / (10-10)! = 3,628,800 ways

Counting Rules (Cases)


Determine whether the following situations would require calculating a permutation or a combination:
a) Selecting three students to attend a conference in Washington, D.C.
Answer: combination
b) Selecting a lead and an understudy for a school play.
Answer: permutation
c) Assigning students to their seats on the first day of school.
Answer: permutation

Counting Rules (Cases)


Evaluate:
Answer=720
A coach must choose five starters from a team
of 12 players. How many different ways can the
coach choose the starters?
Answer=12C5=792
Which of the following is NOT equivalent to ?

Counting Rules (Cases)


The local Family Restaurant has a daily breakfast special in which the customer may choose one item from each of the following groups:

Breakfast Sandwich | Accompaniments | Juice
egg and ham | breakfast potatoes | orange
egg and bacon | apple slices | cranberry
egg and cheese | fresh fruit cup | tomato
 | pastry | apple
 | | grape

a) How many different breakfast specials are possible?
Answer: 3C1 × 4C1 × 5C1 = 60 breakfast choices
b) How many different breakfast specials without meat are possible?
Answer: 1C1 × 4C1 × 5C1 = 20 meatless breakfast choices

Counting Rules (Cases)


In how many ways can we design a car's number plate if it consists of 3 letters followed by 3 digits?
Answer: 26^3 × 10^3 = 17,576 × 1,000 = 17,576,000 ways

What if duplication is not allowed in the same case?
Answer: 26P3 × 10P3 = 15,600 × 720 = 11,232,000 ways

Counting Rules (Cases)


In how many ways can we set a password for our email address if it consists of 6 letters?
Answer: 26^6 = 308,915,776
What if it contains 6 letters or digits or both?
Answer: (26+10)^6 = 2,176,782,336
What if it contains 6 letters and digits and duplication is not allowed?
Answer: 36P6 = 1,402,410,240
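The number plate and password counts are easy to verify in Python. The sketch below assumes a 26-letter alphabet without case distinction and the 10 digits; the variable names are illustrative, and `math.perm` needs Python 3.8 or later.

```python
import math

LETTERS = 26
DIGITS = 10

# Number plate: 3 letters followed by 3 digits.
plates_with_repeats = LETTERS ** 3 * DIGITS ** 3
plates_without_repeats = math.perm(LETTERS, 3) * math.perm(DIGITS, 3)

# Password of length 6.
letters_only = LETTERS ** 6
letters_or_digits = (LETTERS + DIGITS) ** 6
no_repeats = math.perm(LETTERS + DIGITS, 6)

print(plates_with_repeats, plates_without_repeats)
print(letters_only, letters_or_digits, no_repeats)
```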

A Probability Density Function (PDF)


It is a mathematical function which can
generate the probabilities of a probability
distribution.
With reference to the previous random
variable examples, we can generate the
same probabilities using a mathematical
function, i.e. P(X=x) = nCx / 2^n.
If we put n=2 and x=0,1,2, we
obtain the same table.
For X=0, we compute
P(X=0) = 2C0 / 2^2 = 1/4, and so on.

A Probability Distribution

X    P(X=x) = 2Cx / 2^2
0    1/4
1    2/4
2    1/4
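
The table can be regenerated from the function P(X=x) = nCx / 2^n. The sketch below uses exact fractions; the function name `pdf` is only illustrative, and note that Python prints 2/4 in its reduced form 1/2.

```python
from fractions import Fraction
from math import comb

def pdf(x: int, n: int) -> Fraction:
    """P(X = x) = nCx / 2^n, the function given on the slide."""
    return Fraction(comb(n, x), 2**n)

n = 2
table = {x: pdf(x, n) for x in range(n + 1)}

for x, p in table.items():
    print(x, p)  # 0 1/4, then 1 1/2, then 2 1/4
```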

Sequence of Bernoulli Trials (By James Bernoulli) results in a Binomial Random Experiment


In dichotomous random experiments, we
always encounter Bernoulli trials (trials having
two possible outcomes, i.e. Success or Failure).
Consider a sequence of n Bernoulli trials
that contains x successes,
e.g. S,S,F,F,F,S,S,F,S,...,F.
It must contain x success trials and n-x
failure trials. The probability of
x successes in n trials is therefore given by the
following PDF:
P(X=x) = nCx p^x (1-p)^(n-x), where x = 0,1,2,...,n,
n is the number of independent trials and
p is the proportion (probability) of success.

Binomial Random Experiment


(An Example)
[Figure: bar chart of P(X=x) against x; vertical axis from 0 to 0.5]

Suppose that in a particular open-heart surgery
operation the chance of survival of a patient is 70%
(p=0.7). If 3 patients undergo
the same operation, the probabilities for the number of
survivors X are:
P(X=0)=0.027, P(X=1)=0.189, P(X=2)=0.441 and
P(X=3)=0.343. These results indicate that the most likely
outcome is the survival of exactly two patients among three.
The MS-EXCEL syntax is =BINOMDIST(x,n,p,cumul)
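
The four probabilities in the surgery example follow directly from the Binomial PDF. A minimal sketch, with `binomial_pdf` as an illustrative helper name rather than a library function:

```python
from math import comb

def binomial_pdf(x: int, n: int, p: float) -> float:
    """P(X = x) = nCx * p^x * (1-p)^(n-x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 3, 0.7  # three patients, 70% chance of survival each
survival = [binomial_pdf(x, n, p) for x in range(n + 1)]

for x, prob in enumerate(survival):
    print(x, round(prob, 3))  # 0 0.027, 1 0.189, 2 0.441, 3 0.343
```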

Binomial Probability Distribution


(A Case)
I need to obtain a sample proportion based on a
hand-poll of the class:
Tell me, how many of you are pro-Imran Khan as
leader of PTI?
Based on the hand-poll result, we can obtain a
sample proportion p̂.
Now, determine the probability of finding 7 pro-Imran Khan
students if we select 10 Business school
students at random, i.e. P(X=7) where n=10 and p=p̂:
P(X=7) = 10C7 x p̂^7 x (1 - p̂)^3

Binomial Probability Distribution


(A Case)
Probability of finding at most 3 pro-Imran Khan
students?
P(X ≤ 3) = P(X=0)+P(X=1)+P(X=2)+P(X=3)
which is the same as P(X < 4)
Probability of finding at least 7 pro-Imran Khan
students?
P(X ≥ 7) = P(X=7)+P(X=8)+P(X=9)+P(X=10)
which is the same as P(X > 6)
Probability of finding 6 to 8 pro-Imran Khan students?
(If nothing is stated, the end points are included by default.)
P(6 ≤ X ≤ 8) = P(X=6)+P(X=7)+P(X=8)
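
The "at most", "at least" and "between" probabilities are all sums of Binomial PDF terms. The sketch below assumes a purely hypothetical hand-poll result of 0.6, since the actual class proportion is not recorded here; `prob_between` is an illustrative helper.

```python
from math import comb

def binomial_pdf(x: int, n: int, p: float) -> float:
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def prob_between(lo: int, hi: int, n: int, p: float) -> float:
    """P(lo <= X <= hi), both end points included."""
    return sum(binomial_pdf(x, n, p) for x in range(lo, hi + 1))

n = 10
p_hat = 0.6  # hypothetical sample proportion, not from the class poll

at_most_3 = prob_between(0, 3, n, p_hat)     # P(X <= 3), same as P(X < 4)
at_least_7 = prob_between(7, n, n, p_hat)    # P(X >= 7), same as P(X > 6)
six_to_eight = prob_between(6, 8, n, p_hat)  # P(6 <= X <= 8)

print(at_most_3, at_least_7, six_to_eight)
```

Replacing `p_hat` with the proportion observed in class gives the answers to the three questions.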

Mathematical Expectation
(A Binomial Distribution case)
As we know, the mean and variance of any
probability distribution are:
E(X) = Σ x P(X=x) and V(X) = E(X^2) - [E(X)]^2
Using the Binomial PDF, we can express both
measures in terms of the parameters:
E(X) = np and V(X) = np(1-p)
Determine the average number of pro-Imran
Khan students in the group of 10.
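
The general definitions and the parameter formulas should agree. The sketch below checks this numerically for n=10, again with a hypothetical sample proportion of 0.6 standing in for the class result.

```python
from math import comb

n = 10
p_hat = 0.6  # hypothetical sample proportion, not from the class poll

pdf = [comb(n, x) * p_hat**x * (1 - p_hat) ** (n - x) for x in range(n + 1)]

# Definitions: E(X) = sum of x P(X=x), V(X) = E(X^2) - [E(X)]^2
mean = sum(x * px for x, px in enumerate(pdf))
second_moment = sum(x**2 * px for x, px in enumerate(pdf))
variance = second_moment - mean**2

print(round(mean, 6), round(n * p_hat, 6))                    # 6.0 6.0
print(round(variance, 6), round(n * p_hat * (1 - p_hat), 6))  # 2.4 2.4
```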

The Poisson Distribution


Poisson was a French mathematician.
He worked on the Binomial PDF and
obtained its LIMITING FORM by letting
n → ∞ and p → 0 (with np = λ held fixed).
The Poisson probability density function can be
written as: P(X=x) = exp(-λ) λ^x / x!

The domain of the Poisson PDF is x = 0, 1, 2, ... (0 ≤ x < ∞),
where λ is the parameter,
representing the rate of occurrence, or average.

The Poisson Distribution


(Working with PDF)
If the value of the parameter is given, e.g. λ=2, then
find the probability distribution of X.

X        0      1      2      3      4      5      6      7      8
P(X=x)   0.1353 0.2707 0.2707 0.1804 0.0902 0.0361 0.0120 0.0034 0.0009

Determine the probabilities of X using the Binomial
PDF with n=50 and p=0.04.
Did you find any similarity between the Binomial and
Poisson probabilities?
It is due to the asymptotic relationship, as
E(X) = np = 50 x 0.04 = 2, which equals the given λ.
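
The comparison can be tabulated side by side. This sketch prints the Poisson probabilities for a rate of 2 next to the Binomial probabilities for n=50 and p=0.04; the helper names are illustrative.

```python
from math import comb, exp, factorial

def poisson_pdf(x: int, lam: float) -> float:
    """P(X = x) = exp(-lam) * lam^x / x!"""
    return exp(-lam) * lam**x / factorial(x)

def binomial_pdf(x: int, n: int, p: float) -> float:
    return comb(n, x) * p**x * (1 - p) ** (n - x)

lam = 2
n, p = 50, 0.04  # np = 2, matching the Poisson rate

rows = [(x, poisson_pdf(x, lam), binomial_pdf(x, n, p)) for x in range(9)]
for x, pois, binom in rows:
    print(x, round(pois, 4), round(binom, 4))

largest_gap = max(abs(pois - binom) for _, pois, binom in rows)
print(largest_gap)  # small, which is the similarity the slide asks about
```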

The Poisson Distribution


(Cases)
If a secretary makes on average 2 mistakes per page,
then determine the probability that she will
make no mistakes on the next page.
P(X=0) with λ=2.
According to a survey, there
are approx. 4 field mice per acre of land.
Determine the probability that on the next
acre of land at most 2 field mice will be
found.
P(X ≤ 2) with λ=4.
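
Both cases reduce to evaluating the Poisson PDF. A minimal sketch, reading "at most 2" as including 0, 1 and 2:

```python
from math import exp, factorial

def poisson_pdf(x: int, lam: float) -> float:
    return exp(-lam) * lam**x / factorial(x)

# Secretary: rate of 2 mistakes per page, no mistakes on the next page.
no_mistakes = poisson_pdf(0, 2)

# Field mice: rate of 4 per acre, at most 2 on the next acre.
at_most_two_mice = sum(poisson_pdf(x, 4) for x in range(3))

print(round(no_mistakes, 4))       # 0.1353
print(round(at_most_two_mice, 4))  # 0.2381
```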

The Normal (Gaussian) Distribution


(Distribution of a continuous random variable)
Bell-shaped distribution or curve
Perfectly symmetrical about the mean.

Mean = median = mode


Tails are asymptotic: they get closer and closer to the horizontal
axis but never reach it. The approximate domain
formula is μ-3σ ≤ X ≤ μ+3σ

The Normal Probability Density Function


The PDF is written as:

f(x) = [1 / (σ √(2π))] exp[-(x-μ)^2 / (2σ^2)],  -∞ < x < ∞

where μ and σ are the two parameters, the
mean and standard deviation, respectively.
Simplify f(x) for μ=0 and σ=1.
The simplified form is said to be the Standard Normal
Distribution.
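
Setting the mean to 0 and the standard deviation to 1 in the Normal PDF leaves f(z) = exp(-z^2/2) / √(2π). The sketch below codes both forms and compares them with `statistics.NormalDist` from the standard library (Python 3.8+); the function names are illustrative.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """General Normal PDF with mean mu and standard deviation sigma."""
    return exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * sqrt(2 * pi))

def standard_normal_pdf(z: float) -> float:
    """Simplified form for mu = 0 and sigma = 1."""
    return exp(-(z**2) / 2) / sqrt(2 * pi)

points = [-2.0, -1.0, 0.0, 0.5, 3.0]
for z in points:
    print(z, normal_pdf(z, 0, 1), standard_normal_pdf(z), NormalDist().pdf(z))
```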

Normal curves and probability

Finding Area Under the Standard Normal Curve


Determine the following Areas/probabilities using the
Standard Normal Table:
1- P(Z ≤ 1.25) =
2- P(Z< -1.00) =
3- P(Z= -1.00) =
4- P(Z ≥ +1.00) =
Solution:
P(Z ≥ +1.00)
= 1 - P(Z < +1.00)
= 1 - 0.8413 = 0.1587
Theorem: P(Z ≥ +1.00) = P(Z ≤ -1.00)

Finding Area Under the Standard Normal Curve


Determine the following Areas/probabilities using the
Standard Normal Table:
5- P(-1.00 ≤ Z ≤ +1.00) =
Solution:
P(-1.00 ≤ Z ≤ +1.00) = P(Z ≤ +1.00) - P(Z < -1.00)
Theorem:
P(a ≤ Z ≤ b) = P(Z ≤ b) - P(Z < a)
6- P(-2.25 ≤ Z ≤ -1.00) =
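
The table look-ups in these exercises can be cross-checked with the cumulative distribution function in Python's `statistics.NormalDist` (Python 3.8+). This is a sketch only; it assumes item 1 asks for the area to the left of 1.25, and the variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

p1 = Z.cdf(1.25)                # area to the left of 1.25 (assumed reading)
p2 = Z.cdf(-1.00)               # P(Z < -1.00)
p4 = 1 - Z.cdf(1.00)            # upper tail beyond +1.00
p5 = Z.cdf(1.00) - Z.cdf(-1.00)   # area between -1.00 and +1.00
p6 = Z.cdf(-1.00) - Z.cdf(-2.25)  # area between -2.25 and -1.00

for p in (p1, p2, p4, p5, p6):
    print(round(p, 4))  # 0.8944, 0.1587, 0.1587, 0.6827, 0.1464
```

Item 3 needs no computation: for a continuous variable the probability of any single point, such as Z = -1.00, is zero.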

Observing Quantiles (Inverse consideration of Standard Normal Table)


Determine the following Quantiles/Percentage
Points/Z-scores using the Standard Normal Table:
7- P(Z ≤ a) = 0.025

Z       0.09   ...   0.06    ...   0.00
-3.9
...
-1.9                 0.025

Therefore, the answer will be a = -1.96

Observing Quantiles (Inverse consideration of Standard Normal Table)


8- P(Z ≤ b) = 0.05

Z       0.09   ...   0.05     0.04     ...   0.00
-3.9
...
-1.6                 0.0495   0.0505

Therefore, b = -[1.6 + (0.04+0.05)/2] = -1.645


Otherwise, we can simply take the nearest tabulated value.
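
Reading the table in reverse corresponds to the inverse cumulative distribution function. A minimal sketch using `statistics.NormalDist.inv_cdf` (Python 3.8+), with variable names matching the two exercises:

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

a = Z.inv_cdf(0.025)  # z-score with an area of 0.025 to its left
b = Z.inv_cdf(0.05)   # z-score with an area of 0.05 to its left

print(round(a, 2))  # -1.96
print(round(b, 3))  # -1.645
```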

Normal Distribution (Cases)


Soft-drink Analysis from KU canteens
The amount of soft drink in a glass follows a
Normal Distribution with μ=220 ml and σ=5 ml.
If a student purchases one glass of soft drink,
determine the probability that he will get less than
215 ml in his glass:
P(X<215) = ??
We must use the z-transformation Z = (X-μ)/σ, so:
P[(X-μ)/σ < (215-220)/5] = P(Z < -1.00) = 0.1587

Normal Distribution (Cases)


Soft-drink Analysis from KU canteens
P(X<215) = 15.87%
1- There is a 16% chance that he will get less than
215 ml in his glass.
2- We are 16% confident that he will get less than
215 ml in his glass.
3- If 50 students purchase 50 glasses of soft drink,
then approx. 50 x 0.1587 ≈ 8 of them will
have less than 215 ml in their glasses.
Find: P(215 ≤ X ≤ 225) = P(X ≤ 225) - P(X < 215)
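
The soft-drink calculations can be reproduced without a table. The sketch below uses `statistics.NormalDist` (Python 3.8+) with the mean of 220 ml and standard deviation of 5 ml from the case; the variable names are illustrative.

```python
from statistics import NormalDist

glass = NormalDist(mu=220, sigma=5)  # amount of soft drink in ml

z = (215 - 220) / 5                  # z-transformation of 215 ml
p_less_215 = glass.cdf(215)          # same as the standard normal area below z
expected_short = 50 * p_less_215     # expected count among 50 glasses
p_between = glass.cdf(225) - glass.cdf(215)

print(z)                      # -1.0
print(round(p_less_215, 4))   # 0.1587
print(round(expected_short))  # 8
print(round(p_between, 4))    # 0.6827
```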

Normal Probabilities Using MS-EXCEL


For any Normal distribution with μ=250 and
σ=5, we can obtain P(X<245) using the
following syntax:
=NORMDIST(x,μ,σ,cumulative)
=NORMDIST(245,250,5,1)
And for P(X>255):
=1 - NORMDIST(255,250,5,1)
We can apply the same approach to the soft-drink case study.

Index Numbers
Index Numbers are RELATIVE measures.
Index Numbers can be Price Relatives or
Quantity Relatives.
There are two major types of Index Numbers:
1) Simple Index
2) Composite Index
A Simple Index Number can be obtained using
this formula: In = (Pn/P0) x 100, where
Pn is the price in the current year (time) and P0 is the price in the base year (time).

Simple Index (Example)


Consider the following table comprising prices of a
commodity in different years:
Years   Price (Rs)   Fixed Base: In = Pn/54 x 100   Chain Base: In = Pn/P(n-1) x 100
2006    54           54/54 x 100 = 100.0%           54/54 x 100 = 100.0%
2007    60           60/54 x 100 = 111.1%           60/54 x 100 = 111.1%
2008    67           67/54 x 100 = 124.1%           67/60 x 100 = 111.7%

If we want to use the fixed base method with 2006 as the base year,
the indices are computed by dividing
all price values by 54.
In the chain base method, the preceding year's price is used
as the base.
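
The two methods differ only in which price is used as the denominator. A short sketch using the prices from the example; treating the first year as its own base in the chain method is an assumption of this sketch.

```python
prices = {2006: 54, 2007: 60, 2008: 67}
years = sorted(prices)

# Fixed base: every price is divided by the 2006 price.
base_price = prices[years[0]]
fixed_base = {y: round(prices[y] / base_price * 100, 1) for y in years}

# Chain base: each price is divided by the preceding year's price.
# Assumption: the first year has no predecessor, so it is its own base.
chain_base = {}
previous = prices[years[0]]
for y in years:
    chain_base[y] = round(prices[y] / previous * 100, 1)
    previous = prices[y]

print(fixed_base)  # {2006: 100.0, 2007: 111.1, 2008: 124.1}
print(chain_base)  # {2006: 100.0, 2007: 111.1, 2008: 111.7}
```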

Composite Index (Example)


Consider the following table comprising prices of a
commodity in different years for three different cities:
Years   Price City1   Price City2   Price City3   Sum ΣP   Fixed Base: In = ΣPn/156 x 100   Chain Base: In = ΣPn/ΣP(n-1) x 100
2006    54            52            50            156      100%                             100%
2007    60            65            62            187      119.9%                           119.9%
2008    67            65            68            200      128.2%                           107.0%

Before computing the fixed base or chain base index
numbers, we have to obtain the sum of all prices for each year in the ΣP
column.
Finally, we can compute both the fixed base and chain base
indices for the ΣP column using the same procedures.
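
The composite index follows the same steps once the city prices have been summed for each year. A sketch with the three-city data from the table; the names are illustrative, and the first year is again taken as its own base in the chain method.

```python
city_prices = {
    2006: [54, 52, 50],
    2007: [60, 65, 62],
    2008: [67, 65, 68],
}
years = sorted(city_prices)

# Step 1: sum the prices across cities for each year.
totals = {y: sum(city_prices[y]) for y in years}

# Step 2: fixed base index, dividing by the 2006 total.
base_total = totals[years[0]]
fixed_base = {y: round(totals[y] / base_total * 100, 1) for y in years}

# Step 3: chain base index, dividing by the preceding year's total.
chain_base = {}
previous = totals[years[0]]
for y in years:
    chain_base[y] = round(totals[y] / previous * 100, 1)
    previous = totals[y]

print(totals)      # {2006: 156, 2007: 187, 2008: 200}
print(fixed_base)  # {2006: 100.0, 2007: 119.9, 2008: 128.2}
print(chain_base)  # {2006: 100.0, 2007: 119.9, 2008: 107.0}
```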
