Essential Stats For Decision Making-1 Descriptive Stats-2011

In the Name of ALLAH, the beneficent, the Merciful,
O Allah, send your salutations upon Muhammad (PBUH) & on the Family
of Muhammad (PBUH) as you sent your salutations upon Ibrahim & on the
Family of Ibrahim verily you are Most Praiseworthy & Glorious
Quantitative Methods for Decision Making
A Practical and Philosophical approach
By,
Yaseen Ahmed Meenai
Faculty, FCS-IBA
ymeenai@iba.edu.pk
What is Statistics (A science or an art?)

The activity of obtaining data and then compiling, summarizing, presenting, analyzing, interpreting and drawing conclusions from it is called Statistics.
In short, it is:
Data → Process → Information/Conclusions
Statistics is a mixture of science and art: up to the processing stage it is a SCIENCE, while drawing conclusions is an individual's ART.
What is DATA (A word or a Keyword?)

DATA is a group of raw facts and figures which may VARY from:
Person to Person, Object to Object, Distance to Distance and Time to Time.
Only the absence of VARIATION can produce a CONSTANT, and that does not exist in our physical world. Only spiritualism can define a CONSTANT.
Data v/s Variable

A variable is the storage of data; it is represented by letters X, Y, Z etc.
There are two types of variables:
Qualitative Variable: It deals with data which vary by kind. It provides labels, or names, for categories of like items, i.e. a set of observations where any single observation is a word or code that represents a class or category.
Gender, Complexion, Weather and Type are some examples of qualitative variables.
Quantitative Variable: It deals with numeric data, which measure either how much or how many of something, i.e. a set of observations where any single observation is a number that represents an amount or a count.
Age, Height, Number and Price are some examples of quantitative variables.
Source: http://www.microbiologybytes.com/maths/1011-17.html
Inactivity breaker
Object: Take a blank page from your writing material and divide it into two columns in the following manner:

Qualitative Variables     Quantitative Variables
1- Gender                 1- Age
2- Complexion             2- Height
3- Qualification          3- Weight
4- Weather                4- Price
...                       ...
20.                       20.

Try to write at least 20 variables in each column by observing several fields like management, agriculture, medical, engineering, geology etc. Submit the sheet with your full name written at the top.
Data Sources
There are three major sources of data:
1. Survey/Census: An official, usually periodic
enumeration of a population, often including
the collection of related demographic
information, is called census. Survey means to
inspect and determine the conditions of
interest.
2. Experiment: Any activity which is usually conducted in an isolated environment and produces results is called an experiment.
3. Simulation: An artificial way of data collection.
Data Collection/compilation
Teaching Ranks where 1-Very Poor, 5-Excellent
4.5  3.7  4.3  3.3  2.7  4.7  3.8  4.5  3.4  4.0
3.8  2.7  4.3  3.4  3.2  3.7  3.9  3.8  3.8  3.7
3.6  5.0  4.2  4.1  4.2  4.1  3.9  4.5  5.0  3.7
4.8  3.2  4.2  4.5  4.2  5.0  2.9
Data collection/compilation is needed to observe the actual behavior of the variable.
Note: The above data is a simulated version of the actual data.
Data Tabulation (Grouping Exercise)

Step # 01: Finding the range
Range = Max − Min
= 5.0 − 2.7 = 2.3
Step # 02: Finding the number of classes
No. of classes = 1 + 3.3 log10(n) = 1 + 3.3 log10(37)
= 6.175
Step # 03: Finding the width or height (h)
h = Range / No. of classes = 2.3 / 6.175
= 0.372 ≈ 0.4
Class Interval:
One of the intervals into which the range of a variable of a distribution is divided, especially one of the divisions of the base line of a bar chart or histogram.
After forming the structure of class intervals and frequencies by using tally marks, we can observe the actual behavior.
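The three grouping steps above can be sketched in Python. This is only an illustration: the class boundaries used for the tally (start at the minimum, width 0.4, lower boundary included) are an assumption and need not match the boundaries used for the histogram on the slide.

```python
import math

# Teaching ranks copied from the slide (37 simulated observations).
ranks = [4.5, 3.7, 4.3, 3.3, 2.7, 4.7, 3.8, 4.5, 3.4, 4.0,
         3.8, 2.7, 4.3, 3.4, 3.2, 3.7, 3.9, 3.8, 3.8, 3.7,
         3.6, 5.0, 4.2, 4.1, 4.2, 4.1, 3.9, 4.5, 5.0, 3.7,
         4.8, 3.2, 4.2, 4.5, 4.2, 5.0, 2.9]

n = len(ranks)
data_range = max(ranks) - min(ranks)        # Step 1: range
num_classes = 1 + 3.3 * math.log10(n)       # Step 2: number of classes
width = data_range / num_classes            # Step 3: class width (rounded up to 0.4)

# Tally in integer tenths to avoid floating-point boundary problems.
# Assumed convention: first class starts at the minimum, each class is 0.4 wide.
tenths = [round(x * 10) for x in ranks]
start, step = min(tenths), 4
counts = {}
for t in tenths:
    k = (t - start) // step
    counts[k] = counts.get(k, 0) + 1

frequencies = [counts.get(k, 0) for k in range(max(counts) + 1)]
for k, f in enumerate(frequencies):
    lower = (start + k * step) / 10
    print(f"{lower:.1f} to under {lower + 0.4:.1f}: {f}")
```

Working in tenths keeps a value such as 3.1 from falling into the wrong class because of rounding error.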
Data → Process → Information

[Figure: frequency distribution table and histogram of the teaching ranks. x-axis: Ranks, with tick labels 2.7, 3.1, 3.4, 3.8, 4.2, 4.6; y-axis: Frequency, from 0 to 12.]
The frequency distribution table and the histogram above reveal the shape of the opinions held by the students. If we then discover a mathematical model that fits this shape, it is called a probability distribution.
Grouping the data (MSEXCEL)
The Data Analysis option is located in the Data menu; if it is not present there, we can activate it through the Add-Ins section of Excel Options.
Grouping the data (MSEXCEL) cont.

After providing the data range and selecting the Labels and Chart Output options, we can find the histogram either in a new worksheet or in a specified place in the existing sheet.
Bin numbers: These numbers represent the intervals that you want the Histogram tool to use for measuring the input data in the data analysis.
Statistical Measures (An introduction)

The phrase "descriptive statistics" is used generically in place of statistical measures.
These statistics describe or summarize the qualities of data.
Another name is summary statistics, which we mostly use to support our reports/cases/research.
They are beneficial when a graphical summary is not sufficient for the final conclusions.
[Diagram: Data, then Processing by Graph or Processing by Measure, then Conclusions.]
Statistical Measures (An Example)

Consider the following grouped data:

Class Intervals   Frequency   Relative Frequency (R.F.)   Cumulative Relative Frequency (C.R.F.)
2-4               2           2/25 = 0.08                 0.08
4-6               5           5/25 = 0.20                 0.28
6-8               9           9/25 = 0.36                 0.64
8-10              7           7/25 = 0.28                 0.92
10-12             2           2/25 = 0.08                 1.00
Total             Σf = 25     ΣR.F. = 1

The above data show the income, in 1000s of Rupees, of some individuals in the late 1980s.
Statistical Measures (Quartiles)

These are 3 values, represented by Q1, Q2 and Q3 respectively, which divide the data into 4 equal parts.
Each part contains 25% of the observations.
Quartiles usually highlight 4 different classes, i.e. Lower class, Lower Middle, Upper Middle and Upper class.
Min | 25% Lower Class | Q1 | 25% Lower Middle | Q2 | 25% Upper Middle | Q3 | 25% Upper Class | Max
Computing Quartiles
In order to compute quartile values, we need to consider the same frequency distribution with an additional column of Cumulative Frequency.

Class Intervals   Frequency   Cumulative Frequency (C.F.)
2-4               2           2
4-6               5           7
6-8               9           16
8-10              7           23
10-12             2           25
Total             Σf = 25
Computing Quartiles (Procedure)

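For any grouped data, quartiles can be computed by following two simple steps:
Step-1: Finding the location of the ith Quartile (where i = 1, 2 and 3): $\text{Location} = \frac{i \times \sum f}{4}$
Step-2: Finding the value of the ith Quartile: $Q_i = l + \frac{h}{f}\left(\frac{i \times \sum f}{4} - C.F.\right)$
Where l = lower limit of the captured class (the first class whose cumulative frequency reaches the location), h = class width, f = class frequency, C.F. = cumulative frequency of the previous class.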
Computing Quartiles (Demo)

Class Intervals           Frequency   Cumulative Frequency (C.F.)
2-4                       2           2
4-6 (1st Quartile Class)  5           7
6-8                       9           16
8-10                      7           23
10-12                     2           25
Total                     Σf = 25

Step-1 (For Q1): (1 x 25) / 4 = 6.25
Step-2: Q1 = 4 + (2/5)(6.25 - 2) = 5.7
Note: Class width = h = 2
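The two-step procedure can be sketched as a small Python function. The function name and argument layout are illustrative assumptions; the data are the income classes from the slide.

```python
def grouped_quartile(i, lower_limits, freqs, h):
    """Return the i-th quartile (i = 1, 2, 3) of grouped data."""
    n = sum(freqs)
    location = i * n / 4                  # Step 1: location of the quartile
    cum = 0                               # C.F. of the previous class
    for l, f in zip(lower_limits, freqs):
        if cum + f >= location:           # this class captures the location
            return l + (h / f) * (location - cum)   # Step 2: value
        cum += f
    raise ValueError("location lies beyond the last class")

lower_limits = [2, 4, 6, 8, 10]           # lower class limits (1000s of Rupees)
freqs = [2, 5, 9, 7, 2]                   # class frequencies
q1, q2, q3 = (grouped_quartile(i, lower_limits, freqs, h=2) for i in (1, 2, 3))
print(round(q1, 3), round(q2, 3), round(q3, 3))
```

Multiplying the results by 1000 gives the Rupee values of Q1, Q2 and Q3 for the income data.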
Quartiles (Income Classes)

Min = 2000 | 25% Lower Class | Q1 = 5700 | 25% Lower Middle | Q2 = 7222 | 25% Upper Middle | Q3 = 8786 | 25% Upper Class | Max = 12000
Quartiles can be computed using MS Excel, where the data are needed in ungrouped form. The syntax is given below:
=QUARTILE(Data Range,i) where i = 1, 2, 3 is the quartile number.
Exploratory Data Analysis (EDA) by John Wilder Tukey
There are two types of studies:
Hypothetical Study
Exploratory Study
In an exploratory study, we can perform our analysis without following conventional methodologies. In EDA, we can observe the trend of the data by applying different processes to it.
The box-plot is a very useful part of EDA.
The Box-Plot
[Figure: Boxplot of Teaching. The plot marks Min, Q1, Q2, Q3 and Max on the Teaching Ranks axis, which runs from 3 to 5.]
Inter-quartile Range = Q3 − Q1
Exploratory Analysis for Quality ranks from Aventis Field Managers
[Figure: Boxplots of Teaching, Administration & Structure (means are indicated by solid circles).]
Statistical Measures (Central Tendency)
(Mean, Median and Mode)

Mean: The main problem associated with the mean value of some data is that it is sensitive to outliers.
Median: The median is simply the middle value among the scores of a variable. It is the 2nd Quartile (Q2) of any data.
Mode: The most frequent response or value for a variable. Multiple modes are possible: bimodal or multimodal.
Mean, Median and Mode

[Figure: a positively skewed distribution with the Mode, Median and Mean marked. Measurements are on the x-axis and frequencies are on the y-axis.]
The Mode is based on the principle of democracy, while the median (Q2) follows the rule of moderation. The Mean takes its place after being influenced by the higher values of the measurements. The distribution shown is positively skewed.
Mean and Mode (Computations)

Class Intervals       Frequency f_i   Mid-Points x_i    f_i x_i
2-4                   2               (2+4)/2 = 3       2 x 3
4-6                   f1 = 5          (4+6)/2 = 5       5 x 5
6-8 (Modal Class)     fm = 9          (6+8)/2 = 7       9 x 7
8-10                  f2 = 7          (8+10)/2 = 9      7 x 9
10-12                 2               (10+12)/2 = 11    2 x 11
Total                 Σf_i = 25                         Σf_i x_i = 179

Mode: $l + \frac{f_m - f_1}{2 f_m - f_1 - f_2} \times h = 7.333$
Mean: $\bar{x} = \frac{\sum f_i x_i}{\sum f_i} = \frac{179}{25} = 7.160$
The majority's income is Rs. 7333/-
The average income of the community is Rs. 7160/-
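As a check on the arithmetic, the grouped mean and mode can be computed in a few lines of Python. This is a sketch that assumes a single modal class with a neighbouring class on each side, as in the income data.

```python
lower_limits = [2, 4, 6, 8, 10]     # lower class limits
freqs = [2, 5, 9, 7, 2]             # class frequencies f_i
h = 2                               # class width

mids = [l + h / 2 for l in lower_limits]                  # mid-points x_i
mean = sum(f * x for f, x in zip(freqs, mids)) / sum(freqs)

m = freqs.index(max(freqs))         # position of the modal class
f_m, f_1, f_2 = freqs[m], freqs[m - 1], freqs[m + 1]
mode = lower_limits[m] + (f_m - f_1) / (2 * f_m - f_1 - f_2) * h

print(round(mean, 3), round(mode, 3))
```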
Empirical relationship b/w

Following are the values for Mean, Median and Mode obtained from the Income data:
Mean: $\bar{x} = \frac{\sum f_i x_i}{\sum f_i} = \frac{179}{25} = 7.160$
Median = Q2 = 7.222
Mode: $l + \frac{f_m - f_1}{2 f_m - f_1 - f_2} \times h = 7.333$
Mean < Median < Mode (thus the data is slightly -vely skewed)

MS Excel syntaxes for finding the three measures of central tendency are:
=AVERAGE(Data Range)      for Mean
=QUARTILE(Data Range,2)   for Median
=MODE(Data Range)         for Mode
Statistical Measures (Dispersion)

What is DISPERSION??
A dart game can help us understand it.
[Figure: dart boards showing the throws of Player A and Player B.]
Based on visual observation, we can declare Player A the winner because:
Player A is more consistent / less variable / homogeneous / less dispersed,
and Player B is less consistent / more variable / heterogeneous / more dispersed.
Measures of Dispersion
Some Important Measures of Dispersion are:
Range=Max-Min
Variance
Standard Deviation
Mean Deviation
Inter-quartile Range
Coefficient of Variation (C.V.)
Dispersion Measures (Cont)

Variance: $V(X) = \frac{\sum (x_i - \bar{x})^2}{n}$
Variance of the following ungrouped data:
X: 1, 2, 3, 4, 5
Mean = 3
$V(X) = [(1-3)^2 + (2-3)^2 + (3-3)^2 + (4-3)^2 + (5-3)^2]/5 = 2$
Standard Deviation $= \sqrt{V(X)} = 1.414$
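The same figures can be reproduced with Python's standard `statistics` module. Note that the slide divides by n (the population form), which corresponds to `pvariance` and `pstdev`; the sample form `variance`, which divides by n − 1, is shown only for contrast.

```python
import statistics

x = [1, 2, 3, 4, 5]

pop_variance = statistics.pvariance(x)   # divides by n, as on the slide
pop_sd = statistics.pstdev(x)            # square root of the population variance
sample_variance = statistics.variance(x) # divides by n - 1, for comparison only

print(pop_variance, round(pop_sd, 3), sample_variance)
```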
Coefficient of Variation (Consistency Check)

In order to check whether a variable is consistent or not, we need to compute the coefficient of variation:
$C.V. = \frac{\sqrt{V(X)}}{\bar{X}} \times 100$
For any consistent variable, C.V. < 100%
C.V. is a unit-less measure of dispersion.
Variance & Standard deviation (group-data)

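Class Intervals   Frequency f_i   Mid-Points x_i    f_i x_i    f_i (x_i − mean)^2
2-4               2               (2+4)/2 = 3       2 x 3      2(3 − 7.16)^2 = 34.61
4-6               5               (4+6)/2 = 5       5 x 5      5(5 − 7.16)^2 = 23.33
6-8               9               (6+8)/2 = 7       9 x 7      9(7 − 7.16)^2 = 0.230
8-10              7               (8+10)/2 = 9      7 x 9      7(9 − 7.16)^2 = 23.70
10-12             2               (10+12)/2 = 11    2 x 11     2(11 − 7.16)^2 = 29.49
Total             Σf_i = 25                         179        111.36

Variance: $V(X) = \frac{\sum f_i (x_i - \bar{x})^2}{\sum f_i} = \frac{111.36}{25} = 4.45$
Standard Deviation: $\sqrt{V(X)} \approx 2.11$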
Variable Comparison (Property of C.V.)

The Coefficient of Variation for 1, 2, 3, 4, 5 is:
$C.V. = \frac{\sqrt{V(X)}}{\bar{X}} \times 100 = \frac{1.414}{3} \times 100 = 47.1\%$
And for the Income data it is:
$C.V. = \frac{\sqrt{V(X)}}{\bar{X}} \times 100 = \frac{2.111}{7.16} \times 100 = 29.48\%$
So technically, the Income data is more consistent than the first five natural numbers.
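The comparison can be verified with a short Python sketch that computes the grouped variance of the income data and then the C.V. of both data sets. The helper name `coefficient_of_variation` is an illustrative choice.

```python
import math

def coefficient_of_variation(mean, variance):
    """C.V. in percent: standard deviation relative to the mean."""
    return math.sqrt(variance) / mean * 100

# First five natural numbers: mean 3, population variance 2.
cv_numbers = coefficient_of_variation(3, 2)

# Grouped income data (mid-points and frequencies from the table).
mids = [3, 5, 7, 9, 11]
freqs = [2, 5, 9, 7, 2]
n = sum(freqs)
mean_income = sum(f * x for f, x in zip(freqs, mids)) / n
var_income = sum(f * (x - mean_income) ** 2 for f, x in zip(freqs, mids)) / n
cv_income = coefficient_of_variation(mean_income, var_income)

print(round(cv_numbers, 1), round(var_income, 2), round(cv_income, 2))
```

The smaller C.V. identifies the more consistent variable, regardless of the units of measurement.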
Hand-Profile Analysis
(An exploratory approach)
[Figure: outline of a hand marking the measurements, all in cms: Thumb (X1), the fingers X2 to X5, Span (X6) and Length (X7).]

S.No.   X1   X2   X3   X4   X5   X6   X7

For the Measurements (X), determine the Mean, Standard deviation and Coefficient of Variation.
Moments Based Skewness

Moments Based Skewness Measure: $\beta_1 = \frac{\mu_3^2}{\mu_2^3}$
This will always give us a +ve or a zero value.
For a symmetric distribution, this measure will be zero as $\mu_3 = 0$.
For any skewed distribution, $\beta_1$ will have a +ve value and the sign of $\mu_3$ will indicate the direction of skewness, i.e. for any -vely skewed distribution $\mu_3$ will be -ve, and vice versa for a +vely skewed distribution.
Moments Based Kurtosis

What is Kurtosis??
For Leptokurtic state (Less Dispersed): $\beta_2 > 3$
For Mesokurtic state (Normally Dispersed): $\beta_2 = 3$
For Platykurtic state (More Dispersed): $\beta_2 < 3$
Approximate Confidence Interval

For any bell-shaped symmetrical distribution, the following hold:
1) μ ± σ will cover approximately 68% of the observations
2) μ ± 2σ will cover approximately 95% of the observations
3) μ ± 3σ will cover approximately 99.7% of the observations
Where μ and σ are the mean and standard deviation respectively.
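For the normal distribution, which is the standard bell-shaped model, these coverages can be computed exactly. The sketch below uses `statistics.NormalDist` from the Python standard library; treating "bell-shaped" as exactly normal is an assumption made for the illustration.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Probability of falling within k standard deviations of the mean.
coverage = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}

for k, p in coverage.items():
    print(f"within {k} standard deviation(s): {p:.4f}")
```

The three values come out close to 68%, 95% and 99.7% respectively.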
Why Bell-Shaped Symmetrical Distribution??
In a bell-shaped distribution, extreme values occur with low frequency.
The majority falls within one standard deviation of the mean.
It's Nature's Distribution. God created almost all natural measures with a bell-shaped distribution.
Empirical Proof for the Approx. Confidence Intervals
Bring one Neem leaf and measure its length in cms.
Obtain the Mean and Standard Deviation.
Empirically verify the following statements:
1) μ ± σ will cover approximately 68% of the observations
2) μ ± 2σ will cover approximately 95% of the observations
3) μ ± 3σ will cover approximately 99.7% of the observations
H.W. (Group the data and show that it is bell-shaped and symmetric in nature)
Statistical Process Control (SPC)

The concept is based on Approximate Confidence Intervals.
It is usually used to monitor a manufacturing process or to observe an individual's performance.
For this purpose, we set up a graph which is called a Control Chart.
A Control Chart is bounded by two Control Limits.
A Control Chart
[Figure: a control chart. A realization (the plotted observations) moves around the Theoretical/Claimed Value μ, between the Upper Control Limit at μ + 3σ and the Lower Control Limit at μ − 3σ.]
By observing a realization we can monitor any process; the chart alerts us on two conditions:
1- When any observation crosses or even touches a pre-alarm control limit, or
2- When the motion of the realization becomes rhythmic.
Statistical Process Control (An activity)

Consider the following manufacturing process:
X = 2 x Ran#
Simulate 7 observations using this simulator.
Obtain a Control Chart using these parameter values: μ = 1, σ = 0.3.
Deduce whether your process is under control or not. Comment on your realization.

S.No.   X = 2 x Ran#
1       2 x Ran#
2       2 x Ran#
3       2 x Ran#
4       2 x Ran#
5       2 x Ran#
6       2 x Ran#
7       2 x Ran#
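The activity can be mimicked in Python. This sketch assumes that Ran# stands for a uniform random number in [0, 1), as on many pocket calculators; the seed is arbitrary and only makes the run repeatable.

```python
import random

mu, sigma = 1.0, 0.3
ucl = mu + 3 * sigma      # upper control limit
lcl = mu - 3 * sigma      # lower control limit

rng = random.Random(2011)                            # arbitrary seed (assumption)
observations = [2 * rng.random() for _ in range(7)]  # X = 2 x Ran#

# Flag observations that cross or touch a control limit.
alarms = [(i + 1, x) for i, x in enumerate(observations) if x >= ucl or x <= lcl]

print(f"LCL = {lcl:.1f}, UCL = {ucl:.1f}")
for i, x in enumerate(observations, start=1):
    print(i, round(x, 3))
print("out-of-control points:", alarms)
```

Only the first alert condition is checked here; spotting a rhythmic pattern still requires looking at the plotted realization.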
A Class Activity
Write your name at the top of today's class work.
Keep your class work open on your desk.
Leave your seat, check the copy of at least one of your classmates and write your remarks about him/her on a chit.
Submit your remarks chit to me after writing the name of that classmate on it.
Introduction to Probability
It is the science in which we either study a random experiment or observe a random phenomenon.
In the study of probability a sample space is needed, which is the set of all possible outcomes of a random experiment.
It is the link between Descriptive and Inferential Statistics.
Logical Thinking motivation

Drawing a FISH can help us understand logical thinking:
[Figure: a line drawing of a fish.]
Now, try to re-draw the same fish, but without lifting your pen once it touches the paper and without going over any line you have already drawn.
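Logical Thinking through the Venn diagram
A Venn diagram is a rectangular area showing the Sample Space, with some circles inside (usually overlapping) which show the Events.
S = {a, b, c, d, ..., n}
A = {a, b, c, f, g, h}
B = {c, d, e, g, h, i}
C = {f, g, h, i, j, k}
[Figure: Venn diagram of S with circles A, B and C. Regions: A only = a, b; B only = d, e; C only = j, k; A and B only = c; A and C only = f; B and C only = i; all three = g, h; outside the circles = l, m, n.]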
Shading the Venn Diagram

[Figure: shaded Venn diagrams of the events A and B within S, illustrating De Morgan's Law.]
Probability Topics Tree

[Diagram: probability topics tree with the nodes Random Experiment, Sample Space, Outcomes, Events (Mutually Exclusive / Non-Overlapping and Non-Mutually Exclusive / Overlapping), Counting Rules, Probability, Independent: P(A∩B) = P(A) P(B), Dependent, Conditional Probability, Random Variable (Numeric Criteria), Probability Distribution and Expectation.]
What is the Distribution?

Gives us a picture of the variability and central tendency.
Can also show the amount of skewness and kurtosis.
[Figure: Bell-Shaped Symmetrical Distribution, annotated with Central Tendency and Dispersion.]
Probability Distributions
For any frequency distribution we need a variable, while for any probability distribution we need a random variable.
A random variable is the data obtained by converting the outcomes of a sample space into numeric codes after defining a particular criterion, so:
Random Experiment is necessary for a probability distribution
Any experiment with uncertain results (outcomes) is called a random experiment.
For example, mixing an acid and a base will produce salt and water (it's an experiment), but tossing a dice or a coin, or drawing a card from a well-shuffled deck, will produce a random result (these are examples of random experiments). So in each random experiment, we collect all possibilities (outcomes) and make a sample space.
Formation of Sample Spaces

Random Experiments Related to a Fair Coin:
Random Experiment # 1: Tossing a fair coin once
S={H,T}; $2^1 = 2$ outcomes
Random Experiment # 2: Tossing a fair coin twice or tossing 2 fair coins once.
S={HH, HT, TH, TT}; $2^2 = 4$ outcomes
Random Experiment # 3: Tossing a fair coin thrice or tossing 3 fair coins once.
S={HHH, HHT, HTH, THH, THT, TTH, HTT, TTT}; $2^3 = 8$ outcomes
In general, $2^n$ is the number of outcomes when a two-sided coin is tossed n times.
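Sample spaces like these can be enumerated mechanically. The sketch below uses `itertools.product` from the Python standard library; the helper name `coin_sample_space` is an illustrative choice.

```python
from itertools import product

def coin_sample_space(n):
    """All outcomes of tossing a two-sided coin n times."""
    return ["".join(outcome) for outcome in product("HT", repeat=n)]

for n in (1, 2, 3):
    space = coin_sample_space(n)
    print(n, len(space), space)

# The same idea covers dice and mixed experiments.
two_dice = list(product(range(1, 7), repeat=2))      # two tosses of a dice
coin_and_dice = list(product("HT", range(1, 7)))     # one coin and one dice
print(len(two_dice), len(coin_and_dice))
```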
Formation of Dichotomous SS
A truth table can help us form the sample space, e.g. the sample space of Rand. Exp. # 3.
The formation rule is simple: the run length of the values in every next column should be double that of the preceding column. Outcomes can be read horizontally.

S. No.   1st   2nd   3rd
1        H     H     H
2        T     H     H
3        H     T     H
4        T     T     H
5        H     H     T
6        T     H     T
7        H     T     T
8        T     T     T
Random Experiments with Dice

Random Experiment #4: Tossing a fair dice once
S={1,2,3,4,5,6}; $6^1 = 6$ outcomes
Random Experiment #5: Tossing a fair dice twice or tossing two fair dice once
S={11, 12, 13, 14, 15, 16,
21, 22, 23, ..., 26,
.....
61, 62, 63, ..., 66}
$6^2 = 36$ outcomes
Random Experiments Contd..

Random Experiment #6: Tossing a fair coin and a fair dice once
S={H1,H2,H3,H4,H5,H6,T1,T2,...,T6}; $2^1 \times 6^1 = 12$ outcomes
Random Experiment #7: Tossing 2 fair coins and a fair dice once.
S={HH1,HH2,HH3,HH4,HH5,HH6,
HT1,HT2,HT3,...,HT6,
.....
TT1,TT2,...,TT6}
$2^2 \times 6^1 = 24$ outcomes
Random Experiments A Deck of Cards

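Random Experiment #8: Drawing a card from a well-shuffled deck of playing cards.
S = {
Hearts:   King+Queen+Jack+Ace+2+3+...+10   (13 cards)
Diamonds: King+Queen+Jack+Ace+2+3+...+10   (13 cards)
Clubs:    King+Queen+Jack+Ace+2+3+...+10   (13 cards)
Spades:   King+Queen+Jack+Ace+2+3+...+10   (13 cards)
}
Total = 13 + 13 + 13 + 13 = 52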
Formation of Events
What is an Event?
It's a logical statement which should be followed strictly.
We always collect the matching outcomes from the sample space after viewing the Event statement.
For e.g. if we consider Random Exp. # 2:
Object: Tossing a fair coin twice, S={HH,HT,TH,TT}
Event(s):
A={First toss should be a Head}: A={HH, HT}
B={Exactly one Tail in the outcome}: B={HT, TH}
Thus we formed two Non-Mutually Exclusive Events.
[Figure: VENN Diagram of S with circles A and B. HH lies in A only, HT in both A and B, TH in B only, and TT outside both.]
Replicate the same work for Random Experiment #3.
Computing Probability
Probability of an Event
P(A) stands for the probability of an Event A such that:
P(A) = n(A)/n(S)
Where,
n(A) is the number of outcomes present in Event A.
n(S) is the number of outcomes present in the Sample Space.
Probability is the proportion of an Event in a Sample Space.
For any Event A: 0 ≤ P(A) ≤ 1, where A ⊆ S
Computing Probabilities (Example)

Random Experiment # 2: Tossing a fair coin twice or
tossing two fair coins, once.
Sample Space
S={HH,HT,TH,TT},
Event(s)
A={First toss should be a Head},
A={HH, HT}
B={Exactly one Tail in the outcome}: B={HT, TH}
Therefore the probabilities will be:
P(A)=2/4=0.5, i.e. a 50% chance
P(B)=2/4=0.5, i.e. a 50% chance
Interpreting Probability
A probability is attached to every Event and should be interpreted in 3 components:
1) Object of the Random Experiment
2) Value of the Probability
3) Event Statement
For e.g., the interpretation of P(A)=0.5 can be written as:
If we toss a fair coin twice, we have a 50% chance of getting a head in the first toss.
Similarly, P(B)=0.5 would be:
If we toss a fair coin twice, we have a 50% chance of getting exactly one tail in both tosses.
Union, Intersection and Complement

For the same Random Experiment # 2, the following operations show results that need relevant interpretations (where ∪ = OR, ∩ = AND, A′ = not(A)):
Since
S={HH,HT,TH,TT}
A={HH,HT} B={HT,TH}
Therefore,
A∪B={HH,HT,TH}
P(A∪B)=3/4=0.75, i.e. 75%
If we toss a fair coin twice, we have a 75% chance of getting a head in the first toss OR exactly one tail in both tosses.
A∩B={HT}
P(A∩B)=1/4=0.25, i.e. 25%
A′=S−A={TH,TT}
P(A′)=2/4=0.50, i.e. 50%
P(A′)=1−P(A)
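These set operations map directly onto Python's built-in `set` type. The sketch below is only an illustration for equally likely outcomes; the helper name `prob` is an assumption, and `Fraction` is used so the probabilities stay exact.

```python
from fractions import Fraction

S = {"HH", "HT", "TH", "TT"}   # sample space of Random Experiment # 2
A = {"HH", "HT"}               # first toss is a Head
B = {"HT", "TH"}               # exactly one Tail

def prob(event):
    """P(event) = n(event) / n(S) for equally likely outcomes."""
    return Fraction(len(event), len(S))

p_union = prob(A | B)          # A OR B
p_intersection = prob(A & B)   # A AND B
p_complement = prob(S - A)     # not A

print(p_union, p_intersection, p_complement)
```

The complement rule can be confirmed by checking that `p_complement` equals `1 - prob(A)`.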
Practice Questions
Q1) If we toss a fair coin three times, determine the following probabilities:
a) P(A)=Probability of getting exactly one Head in all tosses?
b) P(B)=Probability of getting a Tail in the first toss?
c) P(C)=Probability of getting exactly one Head AND one Tail? P(One Head ∩ One Tail)
d) P(D)=Probability of NOT getting exactly one Head in all tosses? P(A′)
e) P(F)=Probability of either getting exactly one Head in all tosses OR a Tail in the first toss? P(A∪B)
Practice Questions (Contd..)

Q2) If we toss a fair dice twice, determine the following probabilities: (Ref. Random Experiment #5)
a) P(A)=Probability of getting the same number on both Dice?
b) P(B)=Probability of getting an odd number on both Dice?
c) P(C)=Probability of getting a sum of both numbers equal to 5?
d) P(D)=Probability of getting an odd number AND an even number on the two Dice respectively.
e) P(F)=Probability of NOT getting the same number on both Dice?

Q3) If we toss a fair COIN and a fair DICE once, determine the following probabilities: (Ref. Random Experiment #6)
a) P(A)=Probability of getting exactly one Head on the coin?
b) P(B)=Probability of getting an odd number on the Dice?
c) P(C)=Probability of getting exactly one Head with an odd number on the Dice? P(A∩B)
d) P(D)=Probability of getting a number less than 4 on the Dice.
e) P(F)=Probability of NOT getting exactly one Head on the coin? P(A′)=1−P(A)

Q4) If we toss two fair COINS and a fair DICE once, determine the following probabilities: (Ref. Random Experiment #7)
a) P(A)=Probability of getting exactly one Head on the coins?
b) P(B)=Probability of getting an odd number on the Dice?
c) P(C)=Probability of getting exactly one Head with an odd number on the Dice? P(A∩B)
d) P(D)=Probability of getting a number less than 4 on the Dice.
e) P(F)=Probability of NOT getting exactly one Head on the coins? P(A′)=1−P(A)
A VENN diagram case with a Deck

If we draw one card from a well-shuffled deck, determine the following events:
K={it will be a King}: K={4 cards}
A={it will be an Ace}: A={4 cards}
B={it will be a Black Card}: B={26 cards}
D={it will be a Diamond}: D={13 cards}
E={it will be a card numbered from 3 to 5}: E={12 cards}
Show a Venn diagram containing these 5 Events.
[Figure: Venn diagram of the events B, D, K, A and E, built up step by step with the number of cards in each region.]
Deck of playing cards, an example

P(B∩E)=6/52=0.115
Interpretation: When we draw a card from a well-shuffled deck, we have an 11.5% chance of getting a Black card which is numbered from 3 to 5.
Independence/Dependence
A contingency table can help us understand the concept of Independent or Dependent Events.
A contingency table is a bivariate frequency table showing the joint distribution of two variables.
Usually two qualitative variables are used to form a contingency table.
Contingency Table (An Example)

Consider the following table, which represents Gender (Male/Female) and Eyesight Status (Glasses/No Glasses):

Gender/EyeSight    Male (M)      Female (F)    R. Total
Glasses (G)        05 (M∩G)      12 (F∩G)      G = 17
No Glasses (NG)    09 (M∩NG)     19 (F∩NG)     NG = 28
Column Total       M = 14        F = 31        S = 45
Conditional Probability (Example Exercise)

Q: If we select a person at random from this community, determine the probability that the selected person will be:
a) A Male?
Ans. P(M) = 14/45 = 0.31, i.e. 31%
b) A Male with Glasses?
Ans. P(M∩G) = 05/45 = 0.11, i.e. 11%
c) A Male, given that he must be wearing Glasses?
Ans. P(M/G) = 05/17 = 0.294, i.e. 29.4%
P(M/G) = P(M∩G)/P(G) = (05/45)/(17/45) = 0.294
(Independence Check)
If Gender is independent of Eyesight, then the following should hold:
P(M/G) ≈ P(M)
We checked this empirically and got this result:
0.294 ≈ 0.31
Therefore we can say that Gender is independent of Eyesight, which is quite obvious.
We might get different answers for the simple and conditional probabilities if we consider the case of Gender v/s Heart Disease.
(Independence Check)
We might get a different result if we make a minor change in the table:

Gender/EyeSight    Male (M)    Female (F)    R. Total
Glasses (G)        05          19            G = 24
No Glasses (NG)    09          12            NG = 21
Column Total       M = 14      F = 31        S = 45

Now we can observe the dependence from the following result:
P(M/G) = 05/24 = 0.208 is not equal to P(M) = 14/45 = 0.31
Hence Gender and Eyesight are dependent.
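The comparison of P(M/G) with P(M) can be sketched in Python for both versions of the table. The dictionary layout and function names are illustrative assumptions. Strict independence would require the two probabilities to be exactly equal; the slides treat values that are close as evidence of independence.

```python
def p_male(table):
    """Simple probability P(M) from a 2x2 table of counts."""
    total = sum(table.values())
    return (table[("M", "G")] + table[("M", "NG")]) / total

def p_male_given_glasses(table):
    """Conditional probability P(M/G) = n(M and G) / n(G)."""
    glasses = table[("M", "G")] + table[("F", "G")]
    return table[("M", "G")] / glasses

original = {("M", "G"): 5, ("F", "G"): 12, ("M", "NG"): 9, ("F", "NG"): 19}
modified = {("M", "G"): 5, ("F", "G"): 19, ("M", "NG"): 9, ("F", "NG"): 12}

for name, table in (("original", original), ("modified", modified)):
    gap = abs(p_male_given_glasses(table) - p_male(table))
    print(name, round(p_male(table), 3), round(p_male_given_glasses(table), 3), round(gap, 3))
```

The gap between the two probabilities is much larger for the modified table than for the original one.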
(Class Activity)
Generate the following bivariate data by recalling your memories:

S.No   Gender (M/F)   Like/Dislike
1
...
50

Form a bivariate frequency table and test whether Gender is independent of your views or not, i.e.:
P(M/L) ≈ P(M) or P(M/L) ≠ P(M)
An independent result shows no gender discrimination in MIND, and vice versa for dependence.
Random Variables (Example)

If we toss a fair coin twice {Random Exp. # 2}, the sample space will contain all possibilities and it will be:
S={HH,HT,TH,TT}
Now if we define the following criterion, i.e. X={number of heads in each outcome}, then X will be a random variable having these values:
X={2,1,1,0}
Finally we get the probability distribution, so that P(X=0)=0.25 is the chance of having no heads in both tosses, and so on.

A Probability Distribution
X    P(X=x)
0    1/4
1    2/4
2    1/4

Replicate the same for Random Exp. # 3.
Game Theory (Gamblers Example)

Suppose a gambler is offering you to play a game with him by tossing a fair coin and a fair dice once.
He agrees to pay you Rs/- 100 for each Head appearing on the Coin and Rs/- 300 for a number greater than 4 on the Dice.
On the other hand, you will pay him Rs/- 50 for each Tail appearing on the Coin and Rs/- 250 for a number less than 5 on the Dice.
Determine whether it is an Expected Loss or a Gain for you.
Gambler's Example: Working

It is Random Experiment # 6, in which we toss a fair Coin and a fair Dice once.
S={H1, H2, H3, H4, H5, H6, T1, T2, T3, T4, T5, T6}
H=+100, T=−50, {1,2,3,4}=−250, {5,6}=+300
X={−150, −150, −150, −150, +400, +400, −300, ..., +250}

X       P(X=x)
−300    4/12
−150    4/12
+250    2/12
+400    2/12
Mathematical Expectation E(X)

In order to get the total effect of the probabilities on the values of X, we need to form another column:

X       P(X=x)    X P(X=x)         X^2 P(X=x)
−300    0.333     −300 x 0.333     (−300)^2 x 0.333
−150    0.333     −150 x 0.333     (−150)^2 x 0.333
+250    0.167     +250 x 0.167     (+250)^2 x 0.167
+400    0.167     +400 x 0.167     (+400)^2 x 0.167
                  E(X)=            E(X^2)=

E(X) is called the Mean of the Probability Distribution. To find the Variance, we need to form another column for E(X^2).
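The column sums can be obtained by enumerating the sample space. The sketch below rebuilds the distribution of X from the payoff rules of the gambler's example and uses exact fractions; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

coin_pay = {"H": 100, "T": -50}        # payoff per coin face

def dice_pay(face):
    """+300 for a number greater than 4, otherwise -250."""
    return 300 if face > 4 else -250

# Build the probability distribution of X over the 12 equally likely outcomes.
dist = {}
for coin, face in product("HT", range(1, 7)):
    x = coin_pay[coin] + dice_pay(face)
    dist[x] = dist.get(x, 0) + Fraction(1, 12)

mean = sum(x * p for x, p in dist.items())            # E(X)
mean_sq = sum(x ** 2 * p for x, p in dist.items())    # E(X^2)
variance = mean_sq - mean ** 2                        # V(X) = E(X^2) - [E(X)]^2

print(sorted(dist.items()))
print(float(mean), float(variance))
```

A negative E(X) means an expected loss for the player, a positive one an expected gain.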
Quiz # 2
Odd Rows
Suppose a gambler is offering you to play a game with him by tossing two fair coins and a fair dice once.
He agrees to pay you Rs/- 100 for each Head appearing on the Coins and Rs/- 300 for a number greater than 4 on the Dice.
On the other hand, you will pay him Rs/- 50 for each Tail appearing on the Coins and Rs/- 250 for a number less than 5 on the Dice.
Determine whether it is an Expected Loss or a Gain for you.
Even Rows
Suppose a gambler is offering you to play a game with him by tossing two fair coins and a fair dice once.
He agrees to pay you Rs/- 150 for each Head appearing on the Coins and Rs/- 400 for a number greater than 4 on the Dice.
On the other hand, you will pay him Rs/- 100 for each Tail appearing on the Coins and Rs/- 250 for a number less than 5 on the Dice.
Determine whether it is an Expected Loss or a Gain for you.
Combinatorial Problems
What are Counting Techniques?
The concept is usually used:
When the sample space is too large for a Random Experiment.
When we try to explore different ways to arrange/rearrange objects.
When we want to know how huge the domain of possibilities is, even in assigning simple tasks to different individuals.
Counting Rules
In order to understand the concept, we can consider the following case:
We have 3 objects A, B, C and we want to choose 2 objects from the 3.
Then we have 2 questions before we proceed:
Q1: Is duplication allowed? (Y/N)
Q2: Is order important (AB≠BA)? (Y/N)
Counting Rules (Power Principle)

If the answer to both questions is YES, i.e.
Q1: Is duplication allowed? (Y)
Q2: Is order important (AB≠BA)? (Y)
Then the group of arrangements should be:
AA AB AC
BA BB BC
CA CB CC
The total WAYS formula will be $N^r$ where N=3 and r=2, therefore $3^2 = 9$ ways. It is known as the POWER RULE.
Counting Rules (Permutations)

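If the answer sequence is as below:
Q1: Is duplication allowed? (N)
Q2: Is order important (AB≠BA)? (Y)
Then the group of arrangements should be:
AB AC
BA BC
CA CB
The total WAYS formula will be NPr = N!/(N-r)!, where N=3 and r=2, therefore 3P2 = 6 ways. It is known as PERMUTATIONS.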
Counting Rules (Combinations)

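If the answer sequence is as below:
Q1: Is duplication allowed? (N)
Q2: Is order important (AB≠BA)? (N)
Then the group of arrangements should be:
AB AC
BC
The total WAYS formula will be NCr = N!/[r!(N-r)!], where N=3 and r=2, therefore 3C2 = 3 ways. It is known as COMBINATIONS.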
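The three counting rules can be checked by listing the arrangements with Python's `itertools` and comparing the counts with the formulas (`math.perm` and `math.comb` require Python 3.8 or later).

```python
import math
from itertools import combinations, permutations, product

objects = "ABC"
r = 2

power = ["".join(p) for p in product(objects, repeat=r)]   # duplication: Y, order: Y
perms = ["".join(p) for p in permutations(objects, r)]     # duplication: N, order: Y
combs = ["".join(c) for c in combinations(objects, r)]     # duplication: N, order: N

print(len(power), power)   # compare with N^r
print(len(perms), perms)   # compare with NPr
print(len(combs), combs)   # compare with NCr

n = len(objects)
formula_counts = (n ** r, math.perm(n, r), math.comb(n, r))
print(formula_counts)
```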
Counting Rules (A Class Activity)

Once your CLASS TEACHER says START, you all have to change your seats.
Do it as quickly as you can.
Compute the TOTAL TIME required for all possible arrangements:
NPr x time in seconds = Total seconds
(Total Seconds)/60 = Total minutes
(Total Minutes)/60 = Total hours
(Total Hours)/24 = Total days
(Total Days)/365 = Total years required
Counting Rules (Cases)

Solve the following cases with a suitable counting rule:
1- How many ways are possible when we have to decide a batting order in a cricket team?
The answer is Permutations, because duplication is not allowed but order matters, therefore:
10P10 = 10!/(10-10)! = 3628800 ways

Determine whether the following situations would require calculating a permutation or a combination:
a) Selecting three students to attend a conference in Washington, D.C.
Answer: combination
b) Selecting a lead and an understudy for a school play.
Answer: permutation
c) Assigning students to their seats on the first day of school.
Answer: permutation

Evaluate:
Answer=720
A coach must choose five starters from a team
of 12 players. How many different ways can the
coach choose the starters?
Answer=12C5=792
Which of the following is NOT equivalent to ?

The local Family Restaurant has a daily breakfast special in which the customer may choose one item from each of the following groups:

Breakfast Sandwich   Accompaniments       Juice
egg and ham          breakfast potatoes   orange
egg and bacon        apple slices         cranberry
egg and cheese       fresh fruit cup      tomato
                     pastry               apple
                                          grape

a) How many different breakfast specials are possible?
Answer: 3C1 x 4C1 x 5C1 = 60 breakfast choices
b) How many different breakfast specials without meat are possible?
Answer: 1C1 x 4C1 x 5C1 = 20 meatless breakfast choices

In how many ways can we design a car's number plate if it comprises 3 letters followed by 3 digits?
Answer: $26^3 \times 10^3$ = 17576 x 1000 = 17576000 ways
What if duplication is not allowed in the same case?
Answer: 26P3 x 10P3 = 15600 x 720 = 11232000 ways

In how many ways can we set a password for our email address if it comprises 6 letters?
Answer: $26^6$ =
What if it contains 6 letters or numbers or both?
Answer: $(26+10)^6$ =
What if it contains 6 letters and numbers and duplication is not allowed?
Answer: 36P6 =
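Counts of this size are easiest to evaluate by machine. A minimal Python sketch for the number plate and password cases, using `math.perm` (Python 3.8 or later) for the cases without duplication; the variable names are illustrative:

```python
import math

# Number plates: 3 letters followed by 3 digits.
plates_with_duplication = 26 ** 3 * 10 ** 3
plates_without_duplication = math.perm(26, 3) * math.perm(10, 3)

# Passwords of length 6.
letters_only = 26 ** 6                        # letters, duplication allowed
letters_or_digits = (26 + 10) ** 6            # letters or digits, duplication allowed
no_duplication = math.perm(36, 6)             # letters or digits, no duplication

print(plates_with_duplication, plates_without_duplication)
print(letters_only, letters_or_digits, no_duplication)
```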
A Probability Density Function (PDF)

It is a mathematical function which can generate the probabilities in a Probability Distribution.
With reference to the previous random variable example, we can generate the same probabilities using a mathematical function, i.e. $P(X=x) = {}^{n}C_{x} / 2^{n}$.
If we put n=2 and x=0,1,2, we can observe the same table.
For X=0, we can compute:
$P(X=0) = {}^{2}C_{0} / 2^{2} = 1/4$ and so on.

A Probability Distribution
X    $P(X=x) = {}^{2}C_{x} / 2^{2}$
0    1/4
1    2/4
2    1/4
Sequence of Bernoulli Trials (by James Bernoulli) results in a Binomial Random Experiment
In dichotomous random experiments, we always encounter Bernoulli trials (trials having two possibilities, i.e. Success or Failure).
Consider a sequence of n Bernoulli trials containing x successful trials, e.g. S,S,F,F,F,S,S,F,S,...,F.
It must contain x success trials and n-x failure trials. Therefore, for the probability of occurrence of x successes in n trials, we get the following PDF:
$P(X=x) = {}^{n}C_{x}\, p^{x} (1-p)^{n-x}$ where x = 0, 1, 2, ..., n
Where n is the number of independent trials and p is the proportion of success.
Binomial Random Experiment

(An Example)
P(X=x)
0.5
0.4
0.3
0.2
0.1
0
0
Suppose that in a particular open-heart surgery
operation the chance of survival of a patient is 70%
(p=0.7). If 3 patients undergo
the same operation, the probabilities for the number of survivors X are
given below:
P(X=0)=0.027, P(X=1)=0.189, P(X=2)=0.441 and
P(X=3)=0.343. These results indicate that the most likely
outcome is the survival of exactly two patients among the three.
MS-EXCEL syntax is =BINOMDIST(x,n,p,cumulative)
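The same probabilities can be reproduced outside MS-EXCEL. The sketch below implements the binomial PDF directly; the function name binomial_pmf is an assumption, not something from the slides.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X=x) = nCx * p^x * (1-p)^(n-x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Open-heart surgery example: n = 3 patients, p = 0.7 chance of survival.
survival = [round(binomial_pmf(x, 3, 0.7), 3) for x in range(4)]
print(survival)
```

The list holds P(X=0) to P(X=3) in order and should agree with the four values quoted above.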
Binomial Probability Distribution

(A Case)
I need to obtain a sample proportion based on a
hand-poll of the class:
Tell me, how many of you are pro-Imran Khan as the
leader of PTI?
Based on the hand-poll result, we can obtain a
sample proportion p̂.
Now, determine the probability of finding 7 pro-Imran Khan students if we select 10 business school
students at random, i.e. P(X=7) where n=10 and p = p̂:
P(X=7) = 10C7 x (p̂)^7 x (1 - p̂)^3
Binomial Probability Distribution

(A Case)
Probability of finding at most 3 pro-Imran Khan
students?
P(X≤3) = P(X=0)+P(X=1)+P(X=2)+P(X=3)
which is the same as P(X<4).
Probability of finding at least 7 pro-Imran Khan
students?
P(X≥7) = P(X=7)+P(X=8)+P(X=9)+P(X=10)
which is the same as P(X>6).
Probability of finding 6 to 8 pro-Imran Khan students?
(If nothing is stated, the default is inclusive.)
P(6≤X≤8) = P(X=6)+P(X=7)+P(X=8)
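A minimal sketch of these "at most", "at least" and "between" sums is given below. The sample proportion depends on the class poll and is not given in the slides, so the value 0.6 used here is purely hypothetical, as are the function names.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X=x) for a binomial random variable."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binomial_range(lo, hi, n, p):
    """P(lo <= X <= hi), both ends inclusive."""
    return sum(binomial_pmf(x, n, p) for x in range(lo, hi + 1))

n = 10
p_hat = 0.6  # hypothetical poll result, NOT taken from the slides

exactly_7 = binomial_pmf(7, n, p_hat)
at_most_3 = binomial_range(0, 3, n, p_hat)     # P(X <= 3) = P(X < 4)
at_least_7 = binomial_range(7, n, n, p_hat)    # P(X >= 7) = P(X > 6)
six_to_eight = binomial_range(6, 8, n, p_hat)  # P(6 <= X <= 8)

print(exactly_7, at_most_3, at_least_7, six_to_eight)
```

Replacing p_hat with the proportion actually observed in the poll gives the answers for the class.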
Mathematical Expectation
(A Binomial Distribution case)
As we know, the mean and variance of any
probability distribution are:
E(X) = Σ x P(X=x) and V(X) = E(X^2) - [E(X)]^2
But using the binomial PDF, we can compute both
measures in terms of the parameters:
E(X) = np and V(X) = np(1-p)
Determine the average number of pro-Imran
Khan students in a group of 10.
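The shortcut formulas can be checked numerically against the general definitions. In this sketch the proportion 0.6 is again a hypothetical stand-in for the class poll result, and binomial_moments is an illustrative name.

```python
from math import comb

def binomial_moments(n, p):
    """Mean and variance from E(X) = sum of x P(X=x) and V(X) = E(X^2) - [E(X)]^2."""
    pmf = [comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]
    mean = sum(x * q for x, q in enumerate(pmf))
    second_moment = sum(x * x * q for x, q in enumerate(pmf))
    return mean, second_moment - mean**2

n, p_hat = 10, 0.6  # p_hat is hypothetical, not from the slides
mean, variance = binomial_moments(n, p_hat)
print(mean, n * p_hat)                  # both close to np
print(variance, n * p_hat * (1 - p_hat))  # both close to np(1-p)
```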
The Poisson Distribution

Poisson was a French mathematician.
He worked on the binomial PDF and
obtained its LIMITING FORM by letting
n→∞ and p→0 (with np held fixed).
The Poisson probability density function can be
written as: P(X=x) = exp(-λ) λ^x / x!
The domain of the Poisson PDF is 0 ≤ X < ∞,
where λ is the parameter,
showing the rate of occurrence or average.

(Working with PDF)
If the value of the parameter is given, i.e. λ=2, then
find the probability distribution of X.
X       0      1      2      3      4      5      6      7      8
P(X=x)  0.1353 0.2707 0.2707 0.1804 0.0902 0.0361 0.0120 0.0034 0.0009
Determine the probabilities of X using the binomial
PDF with n=50 and p=0.04.
Did you find any similarity between the binomial and
Poisson probabilities?
It is due to the asymptotic (limiting) relationship, as
E(X) = np = 50 x 0.04 = 2, which is equal to the given λ.
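The comparison can be sketched as follows: the binomial probabilities with n=50 and p=0.04 are printed next to the Poisson probabilities with parameter 2. Function and variable names are assumptions made for this illustration.

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x, lam):
    """P(X=x) = exp(-lam) * lam^x / x!"""
    return exp(-lam) * lam**x / factorial(x)

rows = [(x, binomial_pmf(x, 50, 0.04), poisson_pmf(x, 2)) for x in range(9)]
for x, b, q in rows:
    print(f"{x}  binomial={b:.4f}  poisson={q:.4f}")

# Largest absolute difference between the two columns.
max_gap = max(abs(b - q) for _, b, q in rows)
print(max_gap)
```

The two columns are close but not identical, because n=50 is large but finite.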

(Cases)
If a secretary makes 2 mistakes per page on average,
then determine the probability that she will
make no mistakes on the next page.
P(X=0) with λ=2.
According to a survey, there
are approx. 4 field mice per acre of land.
Determine the probability that on the next
acre of land at most 2 field mice will be
found.
P(X≤2) with λ=4.
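Both cases can be worked out with a few lines of code. This is a sketch under the assumption that the stated rates (2 per page, 4 per acre) are the Poisson parameter; the function names are illustrative.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X=x) = exp(-lam) * lam^x / x!"""
    return exp(-lam) * lam**x / factorial(x)

def poisson_cdf(k, lam):
    """P(X <= k) as a sum of P(X=0) ... P(X=k)."""
    return sum(poisson_pmf(x, lam) for x in range(k + 1))

no_mistakes = poisson_pmf(0, 2)       # secretary case, parameter 2
at_most_two_mice = poisson_cdf(2, 4)  # field mice case, parameter 4

print(round(no_mistakes, 4), round(at_most_two_mice, 4))
```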
The Normal (Gaussian) Distribution

(Distribution of a continuous random variable)
Bell-shaped distribution or curve
Perfectly symmetrical about the mean.
Mean = median = mode

Tails are asymptotic: they get closer and closer to the horizontal
axis but never reach it. The approximate domain
formula is μ-3σ ≤ X ≤ μ+3σ
The Normal Probability Density Function

The PDF is written as:
f(X) = [1 / (σ√(2π))] exp[-(X-μ)^2 / (2σ^2)]
where μ and σ are the two parameters, the
mean and standard deviation, respectively.
Simplify f(X) if μ=0 and σ=1.
The simplified form is said to be the Standard Normal
Distribution.
Normal curves and probability
Finding Area Under the Standard Normal Curve
Determine the following areas/probabilities using the
Standard Normal Table:
1- P(Z ≤ 1.25) =
2- P(Z < -1.00) =
3- P(Z = -1.00) =
4- P(Z ≥ +1.00) =
Solution,
P(Z ≥ +1.00)
= 1 - P(Z < +1.00)
= 1 - 0.8413 = 0.1587
Theorem: P(Z ≥ +1.00) = P(Z ≤ -1.00)
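The table look-ups can be cross-checked with Python's standard library. This sketch reads item 1 as P(Z ≤ 1.25), which is an assumption about the inequality sign, and uses statistics.NormalDist in place of the printed table.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

p1 = Z.cdf(1.25)       # item 1, read as P(Z <= 1.25) (assumed sign)
p2 = Z.cdf(-1.00)      # item 2, P(Z < -1.00)
p3 = 0.0               # item 3, a single point has zero area
p4 = 1 - Z.cdf(1.00)   # item 4, P(Z >= +1.00) = 1 - P(Z < +1.00)

print(round(p1, 4), round(p2, 4), p3, round(p4, 4))
```

Items 2 and 4 give the same value, which is the symmetry stated in the theorem.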
Finding Area Under the Standard Normal Curve
Determine the following areas/probabilities using the
Standard Normal Table:
5- P(-1.00 ≤ Z ≤ +1.00) =
Solution,
P(-1.00 ≤ Z ≤ +1.00) = P(Z ≤ +1.00) - P(Z < -1.00)
Theorem:
P(a ≤ Z ≤ b) = P(Z ≤ b) - P(Z < a)
6- P(-2.25 ≤ Z ≤ -1.00) =
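The interval theorem translates into a one-line helper. The name prob_between is hypothetical, and the printed table is again replaced by statistics.NormalDist.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

def prob_between(a, b):
    """P(a <= Z <= b) = P(Z <= b) - P(Z < a)."""
    return Z.cdf(b) - Z.cdf(a)

item5 = prob_between(-1.00, 1.00)
item6 = prob_between(-2.25, -1.00)
print(round(item5, 4), round(item6, 4))
```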
Observing Quantiles (Inverse consideration of Standard Normal Table)
Determine the following quantiles/percentage
points/Z-scores using the Standard Normal Table:
7- P(Z ≤ a) = 0.025
Standard Normal Table (excerpt):
Z       0.09   ...   0.06    ...   0.00
-3.9
..
-1.9                 0.025
Therefore, the answer will be a = -1.96
Observing Quantiles (Inverse consideration of Standard Normal Table)
8- P(Z ≤ b) = 0.05
Standard Normal Table (excerpt):
Z       0.09   ...   0.05     0.04     ...   0.00
-3.9
..
-1.6                 0.0495   0.0505
The area 0.05 lies midway between 0.0495 and 0.0505.
Therefore, b = -[1.6 + (0.04+0.05)/2] = -1.645

Otherwise, we can also take the nearest value.
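The inverse look-up can also be done in code. A minimal sketch, assuming the standard library's NormalDist.inv_cdf is an acceptable substitute for reading the table backwards:

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

a = Z.inv_cdf(0.025)  # value a with P(Z <= a) = 0.025
b = Z.inv_cdf(0.05)   # value b with P(Z <= b) = 0.05

print(round(a, 2), round(b, 3))
```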
Normal Distribution (Cases)

Soft-drink Analysis from KU canteens
The amount of soft drink in a glass follows a
Normal Distribution with μ=220 ml and σ=5 ml.
If a student purchases one glass of soft drink, then
determine the probability that he will get less than
215 ml in his glass:
P(X<215) = ??
We must use the z-transformation Z = (X-μ)/σ, so:
P[(X-μ)/σ < (215-220)/5]
= P(Z < -1.00)
= 0.1587
Normal Distribution (Cases)

Soft-drink Analysis from KU canteens
P(X<215) = 15.87%
1- There is a 16% chance that he will get less than
215 ml in his glass.
2- We are 16% confident that he will get less than
215 ml in his glass.
3- If 50 students purchase 50 glasses of soft drink,
then approx. 50 x 0.1587 ≈ 8 of them will
have less than 215 ml in their glasses.
Find: P(215 ≤ X ≤ 225) = P(X ≤ 225) - P(X<215)
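The soft-drink case can be sketched in code as below, using a mean of 220 ml and a standard deviation of 5 ml as stated for the case. The variable names are illustrative only.

```python
from statistics import NormalDist

glass = NormalDist(mu=220, sigma=5)  # amount of soft drink in ml

z = (215 - glass.mean) / glass.stdev          # z-transformation of 215 ml
p_less_215 = glass.cdf(215)                   # P(X < 215)
expected_short = 50 * p_less_215              # expected count among 50 glasses
p_215_to_225 = glass.cdf(225) - glass.cdf(215)  # P(215 <= X <= 225)

print(z, round(p_less_215, 4), round(expected_short), round(p_215_to_225, 4))
```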
Normal Probabilities Using MS-EXCEL

For any Normal distribution with μ=250 and
σ=5, we can obtain P(X<245) using the
following syntax:
=NORMDIST(x,μ,σ,cumulative)
=NORMDIST(245,250,5,1)
And for P(X>255)
=1 - NORMDIST(255,250,5,1)
We can apply the same approach to the soft-drink case study.
Index Numbers
Index numbers are RELATIVE measures.
Index numbers can be price relatives or
quantity relatives.
Index numbers have two major types:
1) Simple Index
2) Composite Index
A simple index number can be obtained using
this formula: In = (Pn/P0) x 100 where
Pn is the price in the current year (time) and P0 is the price in the base year (time)
Simple Index (Example)

Consider the following table comprising prices of a
commodity in different years:
Years   Price (Rs)   Fixed Base                Chain Base
                     In = (Pn/54) x 100        In = (Pn/P(n-1)) x 100
2006    54           54/54 x 100 = 100.0%      54/54 x 100 = 100.0%
2007    60           60/54 x 100 = 111.1%      60/54 x 100 = 111.1%
2008    67           67/54 x 100 = 124.1%      67/60 x 100 = 111.7%
If we want to use the fixed base method with 2006 as the base year,
then the indices are computed by dividing
all price values by 54.
In the chain base method, the preceding year's price is used
as the base.
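Both methods can be sketched in a few lines. The function names are assumptions for this illustration; the first year of the chain is set to 100 by convention, as in the table.

```python
prices = {2006: 54, 2007: 60, 2008: 67}  # price in Rs per year

def fixed_base_index(prices, base_year):
    """Each year's price relative to the base-year price, times 100."""
    base = prices[base_year]
    return {yr: round(p / base * 100, 1) for yr, p in prices.items()}

def chain_base_index(prices):
    """Each year's price relative to the preceding year's price, times 100."""
    years = sorted(prices)
    index = {years[0]: 100.0}  # first year has no predecessor
    for prev, cur in zip(years, years[1:]):
        index[cur] = round(prices[cur] / prices[prev] * 100, 1)
    return index

fixed = fixed_base_index(prices, 2006)
chain = chain_base_index(prices)
print(fixed)
print(chain)
```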
Composite Index (Example)

Consider the following table comprising prices of a
commodity in different years for three different cities:
Years   Price   Price   Price   Sum    Fixed Base               Chain Base
        City1   City2   City3   ΣP     In = (ΣPn/156) x 100     In = (ΣPn/ΣP(n-1)) x 100
2006    54      52      50      156    100%                     100%
2007    60      65      62      187    119.9%                   119.9%
2008    67      65      68      200    128.2%                   107.0%
Before computing the fixed base or chain base index
numbers, we have to obtain the sum of all prices in the ΣP
column.
Finally, we can compute both the fixed base and chain base
indices for the ΣP column using the same procedures.
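The composite procedure can be sketched as: sum the city prices per year, then index the sums. Names are illustrative. Note that 200/187 x 100 = 106.95, which rounds to 107.0 at one decimal place.

```python
city_prices = {
    2006: [54, 52, 50],
    2007: [60, 65, 62],
    2008: [67, 65, 68],
}

# Step 1: sum the prices of the three cities for each year.
totals = {yr: sum(p) for yr, p in city_prices.items()}

# Step 2: fixed base index, with 2006 as the base year.
base = totals[2006]
fixed = {yr: round(t / base * 100, 1) for yr, t in totals.items()}

# Step 3: chain base index, each year relative to the preceding year.
years = sorted(totals)
chain = {years[0]: 100.0}
for prev, cur in zip(years, years[1:]):
    chain[cur] = round(totals[cur] / totals[prev] * 100, 1)

print(totals)
print(fixed)
print(chain)
```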
