
# CHAPTER – 1

WHAT IS STATISTICS?

1.0 Objectives
1.1 Introduction
1.2 Statistical Modeling
1.3 Probability
1.4 Common statistical Terminology
1.5 Population
1.6 Probable errors in statistics
1.7 Variables
1.8 Statistical Measures (Tools)
1.8.1 Central Tendency
1.8.2 Measures of Dispersion
1.9 Distribution
1.10 Expectations
1.11 Association
1.12 Summary
1.14 Questions for Self - Study

1.0 Objectives

After studying the concept of statistics and its fundamental
operations, students can explain the following:
 Concepts of statistics
 Statistical modelling and statistics for business decision
making
 What probability is and how it is used in business for making
proper decisions
 Common statistical terminology and tools. Students can solve
problems of proper decision making in business involving
numerical data.

What is Statistics? / 1
1.1 Introduction

In our day-to-day life we deal a lot with statistics, maybe without
being aware of it. For example, when you state your graduation marks,
you are making use of the ‘averaging’ concept of statistics. When you
talk of the odds in favour of India winning a cricket match, you are
dealing with ‘probability’. When you talk of the best-selling product,
you speak of the ‘modal value’. Weather forecasts are also based on
statistical analysis of weather data collected with the help of satellites.

Statistics is a mathematical science pertaining to the collection,
analysis, interpretation or explanation, and presentation of data. It is
applicable to a wide variety of academic disciplines, from the physical
and social sciences to the humanities. Statistics is also used for making
informed decisions.

The word statistics can be used in a singular and a plural sense. When
used in the singular sense it describes a discipline, a subject, e.g. BBA
semester I has one subject called Statistics. When used in the plural
sense, it denotes the results obtained from data, e.g. What are the
export statistics? Here you are interested in statistical data related to
exports.

## 1.2 Statistical Modelling and Business Statistics for Decision Making

Statistics can be broadly divided into two categories:

1. Descriptive Statistics
2. Inferential Statistics

Descriptive statistics describes the state of affairs. It tells
you what kind of data you are dealing with, e.g. what is the average
strength of the cement blocks? What is the variability of that strength?
We know that all the cement blocks tested may not give exactly the same
value. We need to know how much the test values differ from each other.
The idea of this variation is given by variability measures. All these
measures are discussed at length in the following chapters. Thus
descriptive statistics can be used to summarize the data, either
numerically or graphically, to describe the sample. Basic numerical
descriptors include the mean and the standard deviation, while graphical
summaries include various kinds of charts and graphs.

Inferential statistics helps us in the decision making process. Can
we decide whether the cement blocks have the required strength from
the given test results? Yes or no? This type of statistics is mainly
useful in hypothesis testing. You will study the introductory concepts of
hypothesis testing in this book. Thus inferential statistics is used to
model patterns in the data, accounting for randomness and drawing
inferences about the larger population. The inferences are drawn through
statistical modelling. Statistical models give us some sort of relationship
between the different variables and observations under study. Modelling
can be used to draw inferences, which may take the form of answers to
yes / no questions (hypothesis testing), estimates of numerical
characteristics (estimation), descriptions of association (correlation), or
modelling of relationships (regression). Other modelling techniques
include ANOVA (Analysis of Variance), time series and data analysis.

Thus statistical methods can be used to summarize or describe a
collection of data. In addition, patterns in the data may be
modelled in a way that accounts for randomness and uncertainty in the
observations, and then used to draw inferences about the process or
population being studied. Descriptive and inferential statistics together
comprise Business Statistics or Applied Statistics.

There is also a discipline called mathematical statistics, which
is concerned with the theoretical basis of the subject.

1.3 Probability

Probability is a theory of chance. It can be broadly classified as
mathematical probability and subjective probability. For example, what
are the chances of India winning a cricket match? A person might say
that the chances are 50%. Is that the same as probability? If the figure
is based on past statistical data on India’s winning and losing record,
and the chances are calculated according to the rules of probability,
then it is a mathematical probability: the probability of India winning
the match is 0.5. Otherwise it is a subjective probability, which is
based on what one feels!

The probability of an event, which is an outcome or set of outcomes of
an experiment, is 1 if the event comprises all possible outcomes of the
experiment. For example, the probability that a toss of a fair coin will
yield head or tail is 1.

The probability of an event is 0 if the event does not comprise any
of the possible outcomes of the experiment. For example, the probability
that a throw of a fair die will yield 7 is 0.

The probability might change when additional information becomes
available. The changed probability is called the posterior probability,
and the probability in the absence of such information is called the
prior probability.

An interesting observation about probability is that even if the
probability of an event is very low, say 0.01, the happening of the event
cannot be ruled out! This is true for every trial. And if the event in
fact happens in a majority of trials, the estimated probability should be
revised upward!

 Probability
Taken as a science, it is the theory of chance.
In connection with an event, it is the chance of that event
happening. The probability of any event lies between 0 and 1,
both included.
 Mathematical or Objective Probability
Probability which is based on statistical data and the
probability axioms is called mathematical probability.
 Axioms of probability
There are three axioms of probability: (1) Chances are always
at least zero. (2) The maximum chance that something happens
is 100%. (3) If two events cannot both happen at the same time,
the chance that either one happens is the sum of their chances.
 Subjective probability
Probability which is based on the feelings or thinking of a
person is called subjective probability.
 Conditional probability
It is the probability of an event calculated on the assumption
that some related event has happened.
 Experiment
An action whose outcomes are of interest to us is called an
experiment, e.g. tossing of a coin.
 Event and Happening of an event
An event is a set of one or more outcomes of an experiment. An
event is said to have happened if the result of the experiment is
one of its outcomes, e.g. in the experiment of tossing a coin there
are two outcomes, head and tail. Two events A and B can be defined
as follows:
Event A : Head shows up in the experiment of tossing of a coin.
Event B : Tail shows up in the experiment of tossing of a coin.
Now if head shows up, then we can say that event A has happened.
The probability of an event A is denoted by P(A).

 Sample Space
Set of all possible outcomes of an experiment is called as sample
space.
 Dependent events
If the happening of one event changes the probability of another
event, then those events are said to be dependent events.
 Independent events
If the happening of one event does not change the probability of
another event, then those events are said to be independent events.
 Mutually Exclusive Events
Two or more events which cannot happen at the same time
are called mutually exclusive events.
 Exhaustive Events
If two or more events cover the entire sample space, i.e. if two or
more events cover all possible outcomes of an experiment, then
such events are called exhaustive events.
 Certain Event
If the probability of an event happening is 1, the event is a
certain event.
 Impossible event
If the probability of an event happening is 0, the event is an
impossible event.
 Complement of an event
The complement of an event means that the event does not happen, e.g.
if event A is getting 1 in a throw of a die, then the complement of A
is not getting 1 in a throw of a die. The complement of event A is
denoted by $A^C$, $A'$ or $\bar{A}$, and $P(A^C) = 1 - P(A)$.
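
The definitions above can be tried out on the throw of a fair die. The following is only a minimal sketch: it assumes that all outcomes are equally likely, and the names `SAMPLE_SPACE`, `probability` and `complement` are illustrative choices, not notation from this chapter.

```python
from fractions import Fraction

# Sample space for one throw of a fair die (assumption: outcomes equally likely).
SAMPLE_SPACE = frozenset({1, 2, 3, 4, 5, 6})


def probability(event):
    """P(event) = favourable outcomes / all outcomes, for equally likely outcomes."""
    favourable = set(event) & SAMPLE_SPACE
    return Fraction(len(favourable), len(SAMPLE_SPACE))


def complement(event):
    """Outcomes of the sample space that do not belong to the event."""
    return SAMPLE_SPACE - set(event)


A = {1}              # event A: getting 1
odd = {1, 3, 5}      # event: an odd number shows up
even = {2, 4, 6}     # event: an even number shows up

print(probability(A))                  # 1/6
print(probability(complement(A)))      # 5/6, i.e. 1 - P(A)
print(probability(SAMPLE_SPACE))       # 1, a certain event
print(probability({7}))                # 0, an impossible event
print(odd & even == set())             # True: mutually exclusive
print(odd | even == SAMPLE_SPACE)      # True: exhaustive
```

Exact fractions are used instead of decimals so that the complement rule can be checked without rounding.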

1. What is statistics?
________________________________________________________
________________________________________________________
2. What is meant by descriptive statistics?
________________________________________________________
________________________________________________________

3. Short Notes:-
i. Probability:
________________________________________________________
________________________________________________________
ii. Descriptive Statistics:
________________________________________________________
________________________________________________________
iii. Inferential Statistics:
________________________________________________________
________________________________________________________

## 1.4 Common Statistical Terminology

 Data or Data Set
Data is a set of measurements of some qualitative aspect or
quantitative aspect. If we record earnings of five persons in a
city, say 100, 200, 200, 500, 1000 (in Rs.) then those figures
will be our data set or data.
 Unit
Unit is an individual about which data is to be collected, e.g. Person.
 Observation
An observation is an individual measurement in a data set, e.g.
Rs. 500 in the data set above.
 Quantitative Data
Quantitative Data is data which has numerical value e.g. the
above data. Another example is marks obtained by students in
a class.
 Qualitative Data
Qualitative Data is data which does not have numerical value. It
is data which is of descriptive nature, e.g. colour of eyes.

Table 1.4.1: Dividend paid by 50 companies

| Dividend (%) | No. of Companies |
|---|---|
| 0 – 6 | 8 |
| 6 – 12 | 10 |
| 12 – 18 | 15 |
| 18 – 24 | 12 |
| 24 – 30 | 5 |

 Frequency
The number of times an observation repeats is called the frequency
of that observation. In our example (table 1.4.1) the frequency of
class 6-12 is 10, which means that 10 companies fall in that class.
 Class
A class is a group of observations whose values fall in a
specified range, e.g. if we make classes of marks obtained by
students as 11-20, 21-30 etc., then the number of students scoring
marks from 11 to 20 will be recorded in class 11-20 and so
on.
 Class Boundary (Class Limits)
The extreme values which define which observations are
included in a class are called class boundaries.
 Upper class boundary and Lower class boundary
The uppermost and lowermost values of a class are called the
upper class boundary and lower class boundary respectively.
In the above example (table 1.4.1) consider the class 12-18:
12 - Lower limit (lower boundary)
18 - Upper limit (upper boundary)
When both 12% and 18% dividends are included in the class, i.e. the
15 companies pay dividends of 12% to 18%, the method is called the
inclusive method. When the lower limit is included and the upper
limit is not included in that particular class, the method is called
the exclusive method,
e.g.

| Marks | No. of Students |
|---|---|
| 0-10 | 12 |
| 10-20 | 15 |
| 20-30 | 6 |
| 30-40 | 9 |

Table 1.4.2

A student, Pratik, with 30 marks will be counted in the class 30-40,
because the method is exclusive: the lower limit 30 of class 30-40
is included, and the upper limit 30 of the previous class 20-30 is
excluded.

 Class Interval
The difference between the upper class boundary and the lower class
boundary is called the class interval. In the above example
(table 1.4.2) the class interval of class 0-10 is 10 - 0 = 10.

 Class Marks
The mid point of a class is known as the class mark. It is the average
of the upper class boundary and lower class boundary, e.g. the class
mark for the class 20-30 is $\frac{20 + 30}{2} = \frac{50}{2} = 25$.
 Class Frequency
Class frequency is the number of observations in the class.
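
The class terms defined above can be checked with a short program. This is only a sketch under stated assumptions: it uses the classes of table 1.4.2 with the exclusive method, the list `marks` is made-up data for illustration, and the function names are not from this book.

```python
# Classes of Table 1.4.2 as (lower limit, upper limit), exclusive method:
# the lower limit belongs to the class, the upper limit does not.
CLASSES = [(0, 10), (10, 20), (20, 30), (30, 40)]


def find_class(value, classes=CLASSES):
    """Return the class that contains the value, or None if no class does."""
    for lower, upper in classes:
        if lower <= value < upper:
            return (lower, upper)
    return None


def class_interval(cls):
    """Upper class boundary minus lower class boundary."""
    lower, upper = cls
    return upper - lower


def class_mark(cls):
    """Mid point of the class."""
    lower, upper = cls
    return (lower + upper) / 2


def class_frequencies(values, classes=CLASSES):
    """Count how many observations fall in each class."""
    counts = {cls: 0 for cls in classes}
    for value in values:
        cls = find_class(value, classes)
        if cls is not None:
            counts[cls] += 1
    return counts


marks = [5, 12, 30, 29, 30, 18, 39]   # hypothetical marks of seven students

print(find_class(30))            # (30, 40): 30 marks go to class 30-40
print(class_interval((0, 10)))   # 10
print(class_mark((20, 30)))      # 25.0
print(class_frequencies(marks))  # {(0, 10): 1, (10, 20): 2, (20, 30): 1, (30, 40): 3}
```

Changing the test `lower <= value < upper` to `lower <= value <= upper` would give the inclusive method, which only makes sense when neighbouring classes do not share a limit.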

1) Midpoint of a class is known as _______________.
2) Difference between upper class boundary and lower class
boundary is called as _______________
3) The uppermost and lowermost values of class are called as
_______________ and _______________.
4) Number of times the observation repeats is called as
_______________.
5) Class is a group of the observations whose value fall in the
_______________.
6) Colour of eyes is a _______________.
7) Marks obtained by students in B.B.A Class is _______________
variable.
8) Probability is a theory of _______________.
9) ANOVA is _______________.
10) Modeling can be used to draw _______________.

1.5 Population

The entire collection of objects or persons about which inferences
are to be drawn is called the population or universe, e.g. if some
conclusions are to be drawn about the students in a particular college,
all students of that college will comprise the population.

 Sample
The part of the population selected for the purpose of the study
is called the sample. In the above case it will be difficult to
interview all the students of the college, as the total number of
students could be in the thousands. In such a case one would select a
few students for interview. The students selected for interview
comprise the sample.

 Sample Size
The number of elements in a sample is called as sample size.

 Sample Survey
A survey based on the responses of a sample of individuals,
rather than the entire population.
 Cluster Sample
In a cluster sample, the entire population is divided into
heterogeneous groups (clusters) and some of these groups are
selected as the sample. A sample of blocks chosen on a geographical
basis is an example of cluster sampling. If the blocks are chosen
separately from different strata, the overall design is a
stratified cluster sample.
 Convenience Sample
A sample drawn because of its convenience is not a probability
sample, e.g. a sample of people having telephone numbers in
Pune city, used to decide about the population of the city, is a
convenience sample. It is selected because it would be easier to
interview people over the phone than to visit their homes. Samples
of convenience are generally not representative of the population,
and it is not possible to quantify how unrepresentative results
based on samples of convenience will be.
 Random Sample
A random sample is a sample whose members are chosen at
random from a given population in such a way that the chance
of obtaining any particular sample can be computed for that
population.
 Simple Random sample or probability sample
A simple random sample is a sample selected from a population
in which every individual of the population has an equal chance of
getting selected. A simple random sample can be drawn in two
ways: SRSWR (Simple Random Sample with Replacement)
and SRSWOR (Simple Random Sample without Replacement).
In SRSWR a unit once selected is put back in the population,
so it can be selected again. In SRSWOR a unit once selected is
not put back in the population, so it cannot be selected again.
If we want to draw a sample of size 2 from the numbers
4, 5, 6, then with SRSWR we can have the following 9 samples:
(4,4), (4,5), (4,6), (5,4), (5,5), (5,6), (6,4), (6,5), (6,6), while with
SRSWOR we can have only 3 samples: (4,5), (4,6), (5,6).
Thus if the sample size is n, i.e. if n units are to be drawn from a
population of N units, then the total number of samples that can be
drawn by the SRSWR method is $N^n$, while with the SRSWOR method
we can draw ${}^{N}C_{n}$ samples.

 Stratified Sample
In random sampling, the sample is sometimes drawn separately
from different disjoint homogeneous (having the same properties)
subsets of a population which itself is heterogeneous (having
different properties), i.e. the population is divided into a number of
groups. Each such group is called a stratum. The plural of stratum
is strata. Samples are drawn separately from each such group.
A sample drawn in this way is called a stratified sample.
For example, to determine the buying habits of persons in
society, one needs to divide the population of the city into various
income groups, because buying habits differ according
to income. Thus a heterogeneous population, that is, a population
having dissimilar incomes, is divided into a number of homogeneous
groups or strata having similar incomes.

 Systematic Sample
A systematic sample from a frame of units is one drawn by
listing the units and selecting a unit after every fixed interval.
For example, if there are 100 units in the population and a sample
of 10 is to be drawn, then every 10th unit is selected. It is not
necessarily the first unit, the eleventh unit, the 21st unit and so on.
The first unit is usually selected by a random number and
then every 10th unit is selected. Systematic samples are not random
samples, but they often behave essentially as if they were
random, if the order in which the units appear in the list is
haphazard. Systematic samples are a special case of cluster
samples. Systematic samples are not as good as simple random
samples. When the starting unit is not selected by the random number
method but is decided by judgement, the sample is called a
Systematic Sample and not a Systematic Random Sample.

 Quota Sample
Quota sampling is a method of sampling widely used in opinion
polling and market research. Interviewers are each given a quota
of subjects of a specified type to attempt to recruit. For example,
an interviewer might be told to go out and select 20 adult men
and 20 adult women, 10 teenage girls and 10 teenage boys so
that they could interview them about their television viewing.
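
The counts given above for SRSWR and SRSWOR, and the systematic sampling procedure, can be sketched in a few lines. This is an illustration under assumptions, not a prescribed method: SRSWR samples are listed as ordered pairs and SRSWOR samples as unordered pairs, exactly as in the example with 4, 5, 6, and the helper `systematic_sample` and the seed value are hypothetical.

```python
import random
from itertools import combinations, product
from math import comb

population = [4, 5, 6]
n = 2  # sample size

# SRSWR: a unit is put back after selection, so it may repeat (N**n samples).
srswr = list(product(population, repeat=n))
# SRSWOR: a unit is not put back, so each sample has distinct units (NCn samples).
srswor = list(combinations(population, n))

print(len(srswr), srswr)     # 9 samples
print(len(srswor), srswor)   # 3 samples


def systematic_sample(units, sample_size, rng=random):
    """Pick a random start in the first interval, then every k-th unit."""
    k = len(units) // sample_size      # fixed interval between selected units
    start = rng.randrange(k)           # random position among the first k units
    return units[start::k][:sample_size]


units = list(range(1, 101))            # a frame of 100 numbered units
sample = systematic_sample(units, 10, random.Random(0))  # fixed seed for repeatability
print(sample)                          # 10 units, each 10 places after the previous one
```

If the starting unit were fixed by judgement instead of `rng.randrange`, the result would be a systematic sample but not a systematic random sample.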

## 1.6 Probable Errors in Statistics

 Sampling Error
Sampling errors are the errors in sample selection, which can arise
as random errors or as errors due to bias (systematic errors).

 Random Error
All measurements are subject to error, which can often be bro-
ken down into two components: a bias or systematic error, which
affects all measurements the same way; and a random error,
which is in general different each time a measurement is made,
and behaves like a number drawn with replacement from a box
of numbered tickets whose average is zero.

 Systematic error
An error that affects all the measurements similarly. For
example, if a ruler is too short, everything measured with it will
appear to be longer than it really is (ignoring random error). If
your watch runs fast, every time interval you measure
with it will appear to be longer than it really is (again, ignoring
random error). Systematic errors do not tend to average out.
Systematic errors can also originate from incorrect sampling
procedures.

 Standard Error (SE)
The standard error of a random variable is a measure of how far
it is likely to be from its expected value; that is, its scatter in
repeated experiments. The SE of a random variable X is defined
to be

$$SE(X) = \sqrt{E\left[(X - E(X))^2\right]}$$

That is, the standard error is the standard deviation of the
errors.
1.7 Variable or Variate
A letter which can take the values of all observations, e.g. if variable
x represents the marks of three students who have scored 40, 50
and 60 marks, then $x_1 = 40$, $x_2 = 50$, $x_3 = 60$.
 Categorical Variable
A variable whose value ranges over categories, such as male,
female. Some categorical variables are ordinal.

 Continuous Variable
A quantitative variable which can take all values in its range is
called a continuous variable. Its set of possible values is an infinite
set. In practice, one can never measure a continuous variable to
infinite precision, so continuous variables are sometimes approximated
by discrete variables. A random variable X is also called
continuous if the chance that it takes any particular value is zero.
A random variable is continuous if and only if its cumulative
probability distribution function is a continuous function (a function
whose graph does not show any break).

 Discrete Variable
A quantitative variable which cannot take all values in its range is
called a discrete variable. Its set of possible values is a finite or
countable set. A discrete random variable is one whose set of possible
values is countable. A random variable is discrete if and only if its
cumulative probability distribution function is a step function, i.e.
its graph has breaks.

 Ordinal Variable
A variable whose possible values can be arranged in some order,
such as short, medium, long. In contrast, a variable whose
possible values are India, China, USA is not an ordinal variable.
Arithmetic with the possible values of an ordinal variable does
not necessarily make sense, but it does make sense to say
that one possible value is larger than another.
E.g. 1) The values 5, 4, 2, 3 can be arranged in order as 2, 3, 4, 5.
2) Good, Better, Best.
 Random Variable
A random variable denotes the possible outcomes of a random
experiment, e.g. when a coin is tossed, the outcome (H or T) is a
random variable.
 Random Experiment
A random experiment is one whose outcome cannot be predicted with
certainty in advance. In the simplest case all outcomes have an
equal chance of appearing, e.g. a throw of a fair die has outcomes
1,2,3,4,5,6. Since all outcomes have an equal chance of appearing,
a throw of a fair die is a random experiment, and if x
denotes the outcome 1,2,3,4,5,6 then x is a random variable.
 Bias
When the measurements are affected by the judgment of the
data collector or data analyst rather than by standard Statistical
procedures, bias is said to be introduced. A biased estimate
gives the value, which is different from the truth. Numerical value
of bias is the average difference between the measurement value
and the actual value which could have been obtained without
bias. Unbiased or random selection procedure is without any
bias.

 Dependent Variable
When the value of the first variable is governed by the value of the
second variable, then the first variable is a dependent variable, e.g.
x = Rank in examination, y = Number of marks in examination.

 Independent Variable
When the value of a variable is not governed by the value of any
other variable, then such a variable is called an independent
variable, e.g. x = Height of a student, y = Marks of the student.

1.8 Statistical Measures (Tools)

1.8.1 Central Tendency

 Measures of Central Tendency
Measures of Central Tendency are the numerical values
representative of the data. These are mean, mode and median.
 Arithmetic Mean
It is given by the sum of all observations divided by the total number
of observations. Consider the observations 10,15,20,30,35.
Arithmetic mean = [10 + 15 + 20 + 30 + 35] / 5 = 110 / 5 = 22
 Geometric Mean
It is given by the nth root of the product of all observations, where
n is the total number of observations. The geometric mean of 2,2,8,8
is $\sqrt[4]{2 \cdot 2 \cdot 8 \cdot 8} = \sqrt[4]{256} = 4$.

 Harmonic Mean
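It is the reciprocal of the average of the reciprocals of all
observations.
The harmonic mean of 2, 2, 2, 8 is calculated as

Step I : $\sum \frac{1}{x} = \frac{1}{2} + \frac{1}{2} + \frac{1}{2} + \frac{1}{8} = \frac{13}{8}$

Step II : $\frac{n}{\sum \frac{1}{x}} = \frac{4}{13/8} = 4 \times \frac{8}{13} = \frac{32}{13} \approx 2.46$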
 Median
The observation that occupies the middle place when the data is
arranged in increasing order is called the median. The median of
10,15,20,30,35 is 20.
 Mode
The mode is the most frequently occurring observation. There can be
more than one mode. The mode of 100, 200, 200, 500, 600 is 200.
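
The worked figures above can be cross-checked with Python's standard `statistics` module. This is just one convenient way to verify the arithmetic; it assumes Python 3.8 or later, and the list passed to `multimode` is a made-up example of data with two modes.

```python
import statistics

data = [10, 15, 20, 30, 35]

# Arithmetic mean: sum of observations divided by their number.
print(statistics.mean(data))                        # 22

# Median: middle observation of the sorted data.
print(statistics.median(data))                      # 20

# Geometric mean: nth root of the product (close to 4 for 2, 2, 8, 8).
print(statistics.geometric_mean([2, 2, 8, 8]))

# Harmonic mean: reciprocal of the mean of reciprocals (32/13, about 2.46).
print(statistics.harmonic_mean([2, 2, 2, 8]))

# Mode: most frequently occurring observation.
print(statistics.mode([100, 200, 200, 500, 600]))   # 200

# When several values tie for most frequent, multimode lists all of them.
print(statistics.multimode([1, 1, 2, 2, 3]))        # [1, 2]
```

The geometric and harmonic means are computed in floating point, so their printed values may differ from 4 and 32/13 in the last decimal places.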

1.8.2 Dispersion

Dispersion gives an idea of the spread of the data around a central
value, say the mean.

 Deviation
Deviation is the difference between an observation and some
reference value. The observation is usually represented by X.
The deviation of X from some value A is $X - A$.
The deviation from the mean is $X - \bar{X}$.
 Absolute deviation
When the deviation is always taken as positive irrespective of its
sign, it is called the absolute deviation. It is represented as
$|X - \bar{X}|$.

 Mean Deviation
It is the sum of absolute deviations from the mean divided by the
total number of observations. See chapter 2 for examples.
 Standard Deviation
It is the square root of the sum of squares of deviations from the
mean divided by the total number of observations. See chapter 2 for
examples.
 Variance
Variance is the square of the standard deviation. See chapter 2 for
examples.
 Quartile deviation
It is the difference between the third quartile and first quartile
divided by 2. It is also called the semi-interquartile range. See
chapter 2 for examples.
 Inter Quartile range
It is the difference between the third quartile and first quartile.
See chapter 2 for examples.
 Range
Range is the difference between the largest value and the small-
est value of the data set. See chapter 2 for examples.
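
As a small numerical illustration of the measures above, the sketch below reuses the observations 10, 15, 20, 30, 35 from section 1.8.1. It follows the definitions given here, dividing by the total number of observations; the variable names are illustrative. Quartile-based measures are left out because textbooks use different conventions for locating quartiles.

```python
import statistics

data = [10, 15, 20, 30, 35]
n = len(data)
mean = statistics.mean(data)                       # 22

deviations = [x - mean for x in data]              # X minus the mean
absolute_deviations = [abs(d) for d in deviations]

mean_deviation = sum(absolute_deviations) / n      # 42 / 5
variance = sum(d ** 2 for d in deviations) / n     # 430 / 5
standard_deviation = variance ** 0.5               # square root of the variance
data_range = max(data) - min(data)                 # largest minus smallest value

print(mean_deviation)        # 8.4
print(variance)              # 86.0
print(standard_deviation)    # about 9.27
print(data_range)            # 25
```

The standard deviation computed this way agrees with `statistics.pstdev(data)`, which also divides by the number of observations.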

1) A quantitative variable which can take all values in its range is
called as _______________.
2) A quantitative variable which cannot take all values in its range
is called as _______________.
3) A biased estimate gives the value which is different from the
_______________.
4) Measures of Central Tendency are _______________.
5) _______________ is the most frequently occurring observation.
6) Variance is square of _______________.
7) _______________ is the square root of variance.
8) _______________ is the difference between the third quartile
and first quartile.
9) _______________ is the difference between the largest value
and the smallest value of the data.
10) Number of times experiment is repeated is called
_______________.
1.9 Distribution

A distribution gives an idea of how individuals are distributed in
the population. It can be represented by a generalized frequency curve.
The distribution can also be represented by a mathematical
relationship known as the distribution function.

 Trials
The number of times an experiment is repeated is called the
number of trials.
 Binomial Distribution
A random variable has a binomial distribution if it denotes
the number of successes of a particular event in n
trials, where p is the probability of success in each trial. The
probability of success remains the same for all trials. The binomial
distribution has two parameters (n, p). It is a discrete distribution,
e.g. the number of heads obtained in tossing a fair coin n times. A
variable representing a binomial distribution is a binomial variable or
binomial variate. See chapter 4 for more details.
 Poisson Distribution
A random variable has a Poisson distribution if it denotes the
number of successes of a particular event when x units are
picked up from the population and m is the mean number of successes,
e.g. finding the probability that a sample of 10 units would contain 2
defectives if the probability of finding a defective is 0.05. The Poisson
distribution has only one parameter, m. It is a discrete distribution.
The Poisson distribution is usually applied where the probability of
success is quite low, e.g. number of accidents, number of defective
products etc. A variable representing a Poisson distribution is a
Poisson variable or Poisson variate. See chapter 4 for more details.
 Normal distribution
A normally distributed random variable is continuous and its
distribution is symmetric about the mean.
50% of observations lie below the mean and 50% of observations lie
above the mean. It has a bell shaped continuous curve in which
the two parts made by the vertical line at the mean fit exactly over
each other.
In this distribution mean = mode = median. A variable
representing a normal distribution is a normal variable or normal
variate. See chapter 4 for more details.

 Standard Normal Distribution
It is a normal distribution in which mean = mode = median = 0
and standard deviation = 1. Variable representing standard nor-
mal distribution is a standard normal variable or standard nor-
mal variate. Standard normal variate is denoted by z.
 Univariate Distribution
A distribution involving only one variable is called a univariate
distribution, e.g. marks obtained by students in an
examination.
 Bivariate distribution
Distribution involving two variables is called as bivariate distribu-
tion e.g. Marks obtained by students in two subjects say eco-
nomics and statistics in an examination.
 Skewed distribution
A distribution that is not symmetrical is skewed distribution.
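
The example of 2 defectives in a sample of 10 units with a defect probability of 0.05 can be worked out numerically. The sketch below uses the standard binomial and Poisson probability formulas, which this chapter only names and chapter 4 is said to cover; taking m = n × p as the Poisson mean is an assumption made here for the comparison.

```python
from math import comb, exp, factorial
from statistics import NormalDist


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n trials, success probability p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def poisson_pmf(k, m):
    """Probability of exactly k successes when the mean number of successes is m."""
    return exp(-m) * m ** k / factorial(k)


n, p, k = 10, 0.05, 2
m = n * p                                  # mean number of defectives, 0.5

print(round(binomial_pmf(k, n, p), 4))     # 0.0746
print(round(poisson_pmf(k, m), 4))         # 0.0758

# Standard normal variate z: mean 0, standard deviation 1.
z = NormalDist(mu=0, sigma=1)
print(z.cdf(0))                            # 0.5, half the observations lie below the mean
```

The two discrete probabilities are close because p is small, which is the situation in which the text says the Poisson distribution is usually applied.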

1.10 Expectations

For a discrete distribution, the expected value is the weighted mean of
all outcomes, where the weights are the probabilities of the outcomes.
For a continuous distribution, the expected value is the mean value.
If X and Y are two random variables, the expected value of their
sum is the sum of their expected values, $E(X + Y) = E(X) + E(Y)$,
and the expected value of a constant a times a random variable
X is the constant times the expected value of X, $E(aX) = aE(X)$.
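
These rules can be checked on the throw of a fair die. The following is a minimal sketch: the dictionary representation of a distribution and the name `expected_value` are illustrative choices, and the two dice are taken to be independent only so that the distribution of their sum is easy to build.

```python
from fractions import Fraction
from itertools import product


def expected_value(distribution):
    """Weighted mean of outcomes; distribution maps outcome -> probability."""
    return sum(x * p for x, p in distribution.items())


# One fair die: each face has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}
e_x = expected_value(die)

# Distribution of the sum X + Y of two fair dice.
two_dice = {}
for x, y in product(die, repeat=2):
    two_dice[x + y] = two_dice.get(x + y, 0) + die[x] * die[y]
e_sum = expected_value(two_dice)

# Distribution of aX for the constant a = 3.
a = 3
scaled = {a * x: p for x, p in die.items()}
e_scaled = expected_value(scaled)

print(e_x)        # 7/2
print(e_sum)      # 7, which equals E(X) + E(Y)
print(e_scaled)   # 21/2, which equals a * E(X)
```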
 Hypothesis
An assumption about the outcome of statistical testing is called a
hypothesis, e.g. the sample is as per required norms according to
the given parameter.
 Parameter
A parameter is a numerical characteristic of the population, such as
the mean or standard deviation, on which the sample is accepted
or rejected.
 Estimator
An estimator is a quantity computed from the sample which is used to
estimate the value of the population parameter. An example of an
estimator is the sample mean, which is an estimator of the
population mean.
 Test Statistics
The value computed from the sample on which the test is based is
known as the test statistic.
 Null Hypothesis
It is the initial assumption about an outcome before testing. It is
denoted by $H_0$, e.g. average strength is as per required norms,
which means the population mean is the same as the sample mean. It is
written as $H_0: \mu = \bar{X}$.
 Alternative Hypothesis
It is the assumption accepted if the null hypothesis is rejected. It is
denoted by $H_1$, e.g. average strength is less than required norms.
 Confidence Interval
A confidence interval is a range of values, computed from the
sample, that is expected to contain the population parameter with a
stated level of confidence, e.g. a 95% confidence interval is
constructed so that, in repeated sampling, about 95% of such
intervals contain the true value of the parameter.
 Confidence Level
The confidence level is the degree of confidence, usually stated as a
percentage, with which the confidence interval is expected to
contain the given parameter of the hypothesis.
e.g. A hypothesis is rejected at the 95% confidence level when the
observed sample would occur with a chance of 5% or less if the
hypothesis were true.
 Significance Level, Critical Level
The significance level is the chance of rejecting the null hypothesis
when it is true. It is the complement of the confidence level.
e.g. 95% confidence level means 5% significance level.
 Critical Value
The critical value in a hypothesis test is the value of the test
statistic beyond which we would reject the null hypothesis.
 Type I Error
Rejecting the null hypothesis when it is true.
 Type II Error
Accepting the null hypothesis when it is false.
 One sided tests or one tailed tests :
A test in which we consider only one side of the distribution,
e.g. greater than or less than testing.
 Two sided tests or two tailed tests :
A test in which we consider both sides of the distribution,
e.g. equal to and not equal to testing.
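
To show how these terms fit together, here is a sketch of a one tailed test for the cement block example. It assumes a z test with a known population standard deviation, a technique this chapter does not itself describe, and every number in it (required strength 50, sample mean 48, standard deviation 5, sample size 25) is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def lower_tailed_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Test H0: mu = mu0 against H1: mu < mu0, with known sigma.

    Returns the test statistic, the critical value and the decision.
    """
    z = (sample_mean - mu0) / (sigma / sqrt(n))    # test statistic
    critical_value = NormalDist().inv_cdf(alpha)   # about -1.645 for alpha = 0.05
    reject_h0 = z < critical_value                 # the lower tail is the rejection side
    return z, critical_value, reject_h0


# Hypothetical data: required mean strength 50, 25 blocks tested.
z, critical_value, reject_h0 = lower_tailed_z_test(
    sample_mean=48.0, mu0=50.0, sigma=5.0, n=25, alpha=0.05
)

print(z)                          # -2.0
print(round(critical_value, 3))   # -1.645
print(reject_h0)                  # True: H0 is rejected at the 5% significance level
```

Here alpha is the significance level, so it is also the chance of a Type I error. With a sample mean of 49 instead of 48, z would be -1.0 and the null hypothesis would not be rejected.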

1.11 Association
Two variables are associated if variation in one variable has an effect
on variation in the other variable.
 Correlation
It is a measure of association between variables.
 Scatter Diagram or Scatter Plot
It is the graph obtained by plotting the values of two variables which
describe a single bivariate observation (e.g. height and weight of a
person). One variable (independent variable) is taken as the X
coordinate and the other variable (dependent variable) as the Y
coordinate.
 Correlation coefficient
The correlation coefficient r is a measure of how nearly a scatter
diagram or scatter plot falls on a straight line. The correlation
coefficient is always between – 1 and + 1.
 Causation, Causal Relation.
Two variables are causally related if changes in the value
of one cause the other to change.
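
The correlation coefficient can be computed with the usual product-moment formula, which this chapter does not state; the sketch below assumes that formula, and the heights and weights are made-up figures for five persons.

```python
from math import sqrt


def correlation_coefficient(xs, ys):
    """Product-moment correlation coefficient r of paired observations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


heights = [150, 160, 170, 180, 190]   # hypothetical heights in cm (X coordinate)
weights = [52, 58, 66, 73, 81]        # hypothetical weights in kg (Y coordinate)

r = correlation_coefficient(heights, weights)
print(round(r, 3))   # close to +1: the points lie almost on a rising straight line

# Points lying exactly on a falling straight line give r = -1 (up to rounding).
print(correlation_coefficient([1, 2, 3, 4], [8, 6, 4, 2]))
```

A value of r near +1 or -1 shows association only; it does not by itself show that one variable causes the other to change.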

1) _______________ is a measure of association between
variables.
2) The correlation Coefficient is always between _______________.
3) _______________ is rejecting the null hypothesis when it is
true.
4) _______________ is accepting the null hypothesis when it is
false.
5) Two variables are _______________ if variation in one variable
has effect on variation in other variable.

1.12 Summary
This chapter explains the importance of statistics to people and the
scope of statistics in different fields such as medical science and
business. It also introduces the different types of statistical theory
and the tools to use in order to reach proper decisions in business.


1.3

1.4

1) Class marks
2) Class interval
3) Upper class boundary and lower class
4) Frequency of that observation
5) Specified range
6) Quantitative Data
7) Quantitative Data
8) Chance
9) Analysis of Variance
10) Inferences

1.8

1) Continuous Variable
2) Discrete Variable
3) Truth
4) Mean, Mode and Median
5) Mode
6) Standard Deviation
7) Standard Deviation
8) Inter Quartile Range
9) Range
10) Trials

1.12

1) Correlation
2) – 1 and + 1

3) Type I Error
4) Type II Error
5) Association

## 1) Write the difference between Descriptive Statistics and Inferential Statistics.
2) Write a short note on, ‘Probability’.
3) What is Sampling?
4) Write types of Sampling.
5) Explain the term ‘Stratified Sampling’.
6) What are the measures of Central Tendency? Write their formulae.

  


CHAPTER 2

MEASURES OF CENTRAL
TENDENCY AND DISPERSION
2.0 Objectives
2.1 Introduction
2.2 Methods of Collection of Primary Data
2.2.1 Direct personal interview
2.2.2 Indirect personal interview
2.2.3 Mailed questionnaire
2.2.4 Scheduled through enumerations
2.3 Organizing the Data
2.3.1 Cumulative Frequency Distribution
2.3.2 Grouped Frequency Distribution
2.3.3 Guidelines for making class intervals
2.3.4 Cumulative grouped frequency distribution
2.4 Graphical Representation
2.5 Pie Chart Calculations
2.6 Frequency Curves
2.7 Cumulative Frequency
2.8 Averages
2.9 Partition Values
2.9.1 Quartiles
2.9.2 Deciles
2.9.3 Percentiles
2.10 Measures of Dispersions
2.10.1 Mean Deviations
2.10.2 Standard Deviation
2.11 The Coefficient of Variation
2.12 Skewness
2.13 Quartiles and the quartile Deviation
2.14 Extreme Values
2.15 Summary
2.17 Questions for Self - Study

2.0 Objectives

## After studying the concepts of measures of central tendency and
dispersion, students can explain the following –
 Methods of collections of Primary Data.
 How to divide and Organize Data.
 Types of charts and graphs.
 Draw Graphs.
 Limitations and advantages of each and every method.
 Calculations.
 Making decision on the basis of statistical tools in the business
implementations.

2.1 Introduction

## In this chapter we will study various types of data, methods of
data collection, data representation and measures of central tendency.

## The first question is ‘What is data?’ Data is the set of values that
we record about the object or objects under study. For example, if
we want to study the performance of students in an examination, the set of marks
obtained by the students will be our data.

## Data can be gathered by different methods, which we will study in
detail in this chapter. After collecting the data we need to classify it
(to know, as in the above case, the performance of the students), analyze
it to arrive at conclusions, and finally present our findings to the
concerned authorities, e.g. the teacher. Data representation usually
precedes data analysis, as we come to know various trends
within the data through data representation.

## In mathematics, an average, or central tendency, of a data set
refers to a measure of the middle or expected value of the data set. There
are many different types of averages that can be chosen as a measure
of central tendency of the data items. The most common is
the arithmetic mean; the geometric mean, harmonic mean etc. are also
useful in applicable situations.

## Dispersion, on the other hand, is about whether, as in the above case, the
majority of students have scored marks near the average or far away from
the average. For example, the average of 1 and -1 is 0. Similarly, the average
of 100 and -100 is also 0. In the first case the data values are near the
average, i.e. dispersion is less, while in the second case the data values are more
scattered or spread away from the average, i.e. dispersion is more.
The most common measure of dispersion is the standard deviation.
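
The two pairs of values above can be checked with a few lines of Python. This is only a sketch using the standard library `statistics` module; `pstdev` is the population standard deviation, which is an assumption about which form of the standard deviation is meant here.

```python
from statistics import mean, pstdev

near = [1, -1]       # values close to the average
far = [100, -100]    # values far from the average

mean_near, mean_far = mean(near), mean(far)
sd_near, sd_far = pstdev(near), pstdev(far)

# Both lists have the same average but very different dispersion.
print(mean_near, sd_near)
print(mean_far, sd_far)
```

Both lists have the same mean, yet the standard deviation of the second list is a hundred times that of the first.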

## 2.2 Methods of Collection of Primary Data*

Primary data is data collected for the first time through census
or sample. There are several ways of collecting such data. These are :

##  Direct personal interview or observation.

 Indirect personal interview or observation.
 Mailed questionnaire
 Scheduled through enumerators.

## In the direct personal interview, the investigator collects the infor-

mation directly from the sources concerned. For example, an investiga-
tor may collect information about cost of cultivation through personal in-
teraction with the farmers who cultivate the land.

## - Information so collected is more accurate, reliable and useful.
- The investigator can check and countercheck the information
and get it in the form he desires.
- The investigator can put alternative questions suited to the edu-
cational and cultural level of the persons concerned.
- In such cases, information can be collected by eliminating the
bias and prejudices of the persons concerned.

 Limitations:

## - Such a method can be adopted only when the enquiry is inten-

sive and localized to a locality or a group. This cannot be used
when the enquiry is extensive or is to be done in large areas.
- Such an enquiry is subjective in the sense that the intelligence,
tact, skill as well as personal bias of the investigator are all
reflected in the process.

## Measures of Central Tendency & Dispersion / 25

2.2.2 Indirect Personal Interview

## The indirect personal investigation is through some agencies that

have intimate knowledge of the phenomenon under enquiry. For example,
an investigator may collect information about cost of cultivation indirectly

## - It is less time consuming and less expensive.

- As information can be collected from more knowledgeable per-
sons, these are expected to be more useful and reliable.
- As fewer persons need be contacted, the enquiry could be more
extensive than in case of direct personal enquiry.

 Limitations:

## - The information collected is subjective and is subject to the

personal bias of the persons from whom it is collected.
- One has to be very careful about the selection of such persons:
not only their knowledge but also their personal attributes affect the
quality of data. Great caution is called for in dealing with such a
situation.

## Questionnaires are usually administered on paper, in a struc-

tured or semi-structured format. Respondents often choose from among
a set of forced-choice, or provided responses. These can include yes/no
or scaled responses. Questionnaires can be administered in person, by
mail, over the phone, or via email/Internet.

## The questionnaire, in the form of a set of questions, is sent by
mail to the persons from whom information is to be collected. They, in
their turn, are expected to answer the questions, supply additional
information and comments where necessary, and mail it back
to the investigator. Great care is to be taken while preparing a questionnaire.
Skill and experience in the subject under enquiry are needed in drafting a
questionnaire. Though there are no hard and fast rules for designing a
questionnaire, there are a few general points which should be borne in mind.

## · The questions put should be clear, concise and unambiguous.

· Delicate questions are to be put with great care. Often indirect
questions should be put to get answers to some pertinent point.
It is sometimes desirable to avoid very delicate questions.
· The size of the questionnaire schedule should be as small as
possible. It saves time, both for the enumerator and the respon-
dent. A large questionnaire is likely to exhaust the patience of
the respondent.
· There should be a natural, logical order in which questions are
arranged.
It should be noted that the information collected through ques-
tions should be such that it is usable.

- It can be administered to large groups of individuals.
- It is much less time consuming and is economical.
- A much larger coverage can be made as people in distant places
can be reached without much difficulty.
- It is advantageous in a situation where the persons concerned
move to far away places. For example, in an enquiry relating to
old students of a college, such a method may be useful as
students move out and away after leaving the institution.
- Useful for collection of demographic information, satisfaction lev-
els and opinions of the program.

 Limitations:
- The method can be adopted only in case of enlightened and
educated people.
- As persons are not approached directly, the proportion of
non-response is usually much larger. People do not have the time to
spare, nor are they willing to take the trouble of writing the
answers and returning the questionnaire. Sometimes people also
do not like to record information in their own handwriting and
very often avoid answering delicate questions.

## In census and large scale surveys, enumerators (the persons who
collect the data) are engaged to collect information
from the persons concerned. They gather information in schedules
or questionnaires specially prepared for the purpose, in the form of
answers given by the respondents to specific questions.


In the case of a census, enumerators visit every member of the
source in the zones or areas specifically allotted to them, and in the case
of a sample survey, they visit those members who come under their sampling
procedure. This method is applied in censuses and in most other
extensive enquiries designed to cover larger areas or populations.

## - This method is the only one possible in case of extensive cen-

suses as well as sample enquiries.
- There is a much lesser degree of subjectivity on the part of
interviewer in this method.
- This method is useful where the scope and coverage is large
enough.

 Limitations:

## - It is expensive and time consuming

- Thorough training of the enumerators is needed before they are
sent to the field. It also needs an organization to handle the whole
process of appointment, training and supervision of enumeration
work.

## 2.3 Organizing the Data

- Frequency Distribution

## The most important method of organizing and summarizing
statistical data is by constructing a frequency distribution table. In this method,
classification is done according to quantitative magnitude. The items are
classified into groups or classes according to their increasing order in terms
of magnitude, and the number of items falling into each group is
determined and indicated.

Consider the following set of data, which are the high temperatures
recorded for 30 consecutive days. We wish to summarize this
data by creating a frequency distribution of the temperatures.

Frequency Distribution

50 45 49 50 43
49 50 49 45 49
47 47 44 51 51
44 47 46 50 44
51 49 43 43 49
45 46 45 51 46

## To create a frequency distribution from this data we proceed as

follows:

1. Identify the highest and lowest values in the data set. In the
given temperature data the highest temperature is 51 and the
lowest temperature is 43.
2. Create a column with the title of the variable we are using, in this
case temperature. Enter the highest score at the top, and include
all values within the range from the highest score to the
lowest score.
3. Create a tally column to keep track of the scores as you enter
them into the frequency distribution. Once the frequency distri-
bution is completed you can omit this column. Most printed
frequency distributions do not retain the tally column in their
final form.
4. Create a frequency column, with the frequency of each value,
as shown in the tally column recorded.
5. At the bottom of the frequency column record the total frequency
for the distribution, preceded by “N =”.
6. Enter the name of the frequency distribution at the top of the
table.

## If we applied these steps to the temperature data we have the

following frequency distribution.


Discrete Frequency Distribution for High Temperatures

Temperature    Tally    Frequency
51                      4
50                      4
49                      6
48                      0
47                      3
46                      3
45                      4
44                      3
43                      3
N = 30
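
The steps above can be sketched in Python. The listing below is only an illustration: it uses the 30 temperatures given earlier and the standard library `Counter` in place of a hand tally; the variable names are our own.

```python
from collections import Counter

temperatures = [
    50, 45, 49, 50, 43,
    49, 50, 49, 45, 49,
    47, 47, 44, 51, 51,
    44, 47, 46, 50, 44,
    51, 49, 43, 43, 49,
    45, 46, 45, 51, 46,
]

counts = Counter(temperatures)  # plays the role of the tally column
highest, lowest = max(temperatures), min(temperatures)

# List every value from the highest to the lowest, including
# values that never occur (frequency 0), as step 2 requires.
frequency_table = [(t, counts[t]) for t in range(highest, lowest - 1, -1)]
n_total = sum(f for _, f in frequency_table)

print("Temperature  Frequency")
for t, f in frequency_table:
    print(f"{t:>11}  {f:>9}")
print(f"N = {n_total}")
```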

## A cumulative frequency distribution can be created from a frequency
distribution by adding a column entitled “Cumulative
Frequency”. For each score value, the cumulative frequency is the sum of the
frequencies up to and including the frequency for that value. In
the cumulative frequency distribution for the high temperatures data below,
notice that the cumulative frequency for the lowest temperature (43)
is 3, and that the cumulative frequency for the temperature 44 is 3+3 or 6.
The cumulative frequency for a given value can also be obtained by adding
the frequency for the value to the cumulative frequency for the value below the
given value. For example, the cumulative frequency for 45 is 10, which is
the cumulative frequency for 44 (6) plus the frequency for 45 (4). Finally,
notice that the cumulative frequency for the highest value (51 in the current
case) should be the same as the total of the frequency column (30 in
the case of the temperature data).

Cumulative Frequency Distribution for High Temperatures

Temperature    Tally    Frequency    Cumulative Frequency
51                      4            30
50                      4            26
49                      6            22
48                      0            16
47                      3            16
46                      3            13
45                      4            10
44                      3            6
43                      3            3
N = 30

## In summary then, to create a cumulative frequency distribution:

1. Create a frequency distribution.
2. Add a column entitled cumulative Frequency.
3. The cumulative frequency for each score is the frequency up to
and including the frequency for that score.
4. The highest cumulative frequency should equal N (the total of
the frequency column)
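
As a sketch, the four steps above amount to a running total of the frequency column, starting from the lowest value. The frequencies below are those of the high temperatures example; `itertools.accumulate` is used for the running sum, and the variable names are our own.

```python
from itertools import accumulate

# (temperature, frequency) pairs, listed from the lowest value upward.
frequencies = [
    (43, 3), (44, 3), (45, 4), (46, 3), (47, 3),
    (48, 0), (49, 6), (50, 4), (51, 4),
]

# Running total of the frequency column.
running = list(accumulate(f for _, f in frequencies))
cumulative_table = [(t, f, cf) for (t, f), cf in zip(frequencies, running)]

for t, f, cf in reversed(cumulative_table):  # print highest value first
    print(f"{t:>4} {f:>4} {cf:>4}")

# Step 4: the highest cumulative frequency must equal N.
n_total = sum(f for _, f in frequencies)
print("check:", cumulative_table[-1][2] == n_total)
```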

## In some cases it is necessary to group the values of the data to

summarize the data properly. For example, you wish to create a fre-
quency distribution for the IQ scores of your class of 30 pupils. The IQ
scores in your class range from 73 to 139. To include these scores in a
frequency distribution you would need 67 different score values (139 down
to 73). This would not summarize the data very much. To solve this prob-
lem we would group scores together and create a grouped frequency
distribution.
If your data has more than 20 score values, you should create a
grouped frequency distribution by grouping score values together into
class intervals. To create a grouped frequency distribution:
1. Select an interval size so that you have 7-20 class intervals.
2. Create a class interval column and list each of the class inter-
vals.
3. Each interval must be the same size, the intervals must not overlap, and
there must be no gaps within the range of class intervals.
4. Create a tally column (optional)
5. Create a midpoint column for interval midpoints
6. Create a frequency column
7. Enter N (the sum of the frequencies) at the bottom of the frequency column.

## 1. There should be approximately 7 to 20 mutually exclusive class
intervals. “Mutually exclusive” means that a score can belong to
only one class interval. Two non-mutually exclusive class
intervals would be 45-49 and 47-51, since the scores 47, 48, and
49 could belong to either class interval. In a grouped frequency
distribution, the class intervals must be mutually exclusive.
2. Do not omit any class intervals. Just as with regular frequency
distribution, all possible scores between the largest score and
the smallest score of the data set must be included in the grouped
frequency distribution. Even if an interval has a frequency of
zero, it is to be included in the list of class intervals.
3. The class interval size is usually 3 or more. The class interval
size is defined as the upper real limit of the class interval minus
the lower real limit of the class interval. For instance, in the class
interval 45-49, the class interval size is 49.5 – 44.5 = 5.
4. Pick the smallest class interval size of 3 or 5 or a multiple of 5
that also satisfies the first guideline of producing approximately
7 to 20 class intervals. In other words, if a class interval size of
25 will produce approximately 18 class intervals and a class
interval size of 30 will produce approximately 12 class intervals,
select 25 as the class interval size. The rationale for this rule is that
it is better to under-summarize the data, by using a smaller
class interval size, than to over-summarize the data, which would
result from a larger class interval size.
5. The class interval size should be equal for all class intervals. If
the class interval size were not equal for all class intervals, then
we could not perform the statistical computations which use
grouped frequency distributions.
6. The lower apparent limit of each class interval should be a mul-
tiple of the class interval size. If the lowest score in the data set
is 46 and a class interval size of 5 has been selected, the first
class interval would be 45-49 because 45 (the lower apparent
limit) is a multiple of 5 while 46 is not.
Look at the following data of temperatures for 50 days. The high-
est temperature is 59 and the lowest temperature is 39. If we were to
create a simple frequency distribution of this data we would have 21
temperature values. This is greater than 20 values so we should create a
grouped frequency distribution.

Data Set – High Temperatures for 50 days
57 39 52 52 43
50 53 42 58 55
58 50 53 50 49
45 49 51 44 54
49 57 55 59 45
50 45 51 54 58
53 49 52 51 41
52 40 44 49 45
43 47 47 43 51
55 55 46 54 41

## If we use this data and follow the suggestions for creation of a

grouped frequency distribution, we would create the following grouped
frequency distribution. Note that we use an interval size of three so that
each class interval includes three score values. Also note that we have
included an interval midpoint column, this is the midvalue of each inter-
val.

## Grouped Frequency Distribution for High Temperatures

Class Interval    Tally    Interval Midpoint    Frequency
57-59                      58                   6
54-56                      55                   7
51-53                      52                   11
48-50                      49                   9
45-47                      46                   7
42-44                      43                   6
39-41                      40                   4
N = 50
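
The grouping can also be reproduced with a short program. The following is a minimal sketch, assuming an interval size of 3 and a first interval starting at the lowest temperature (39, which is a multiple of 3); the variable names are our own.

```python
temperatures = [
    57, 39, 52, 52, 43,
    50, 53, 42, 58, 55,
    58, 50, 53, 50, 49,
    45, 49, 51, 44, 54,
    49, 57, 55, 59, 45,
    50, 45, 51, 54, 58,
    53, 49, 52, 51, 41,
    52, 40, 44, 49, 45,
    43, 47, 47, 43, 51,
    55, 55, 46, 54, 41,
]

interval_size = 3
lowest, highest = min(temperatures), max(temperatures)

# Start every interval at frequency 0 so that no interval is omitted.
counts = {lower: 0 for lower in range(lowest, highest + 1, interval_size)}
for t in temperatures:
    lower = lowest + (t - lowest) // interval_size * interval_size
    counts[lower] += 1

# Rows: (lower limit, upper limit, midpoint, frequency), highest interval first.
table = [
    (lower, lower + interval_size - 1, lower + (interval_size - 1) / 2, counts[lower])
    for lower in sorted(counts, reverse=True)
]
for lower, upper, midpoint, frequency in table:
    print(f"{lower}-{upper}  {midpoint:>5}  {frequency:>3}")
print("N =", sum(counts.values()))
```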


2.3.4 Cumulative grouped frequency distribution

## It is a simple matter to create a cumulative grouped frequency

distribution. We just add a cumulative frequency column to the grouped
frequency distribution and we have a cumulative grouped frequency dis-
tribution. The cumulative grouped frequency distribution below was cre-
ated by adding a cumulative frequency column.

## Cumulative Grouped Frequency Distribution for High Temperatures

Class Interval    Tally    Interval Midpoint    Frequency    Cumulative Frequency
57-59                      58                   6            50
54-56                      55                   7            44
51-53                      52                   11           37
48-50                      49                   9            26
45-47                      46                   7            17
42-44                      43                   6            10
39-41                      40                   4            4
N = 50

## Components of Graph: The following figures explain the main

components of graph created by computer software

2. Title :
It can contain the title and subtitle if any of the graph.
3. Axis :
Base line when data is positioned on a graph. Scale and scale
label are displayed on the axis. Unit label, axis title, and break
line are also displayed if necessary. The name of each axis
may vary depending on the chart type.
4. Plot area
The area in which the graph is plotted.
5. Series
The group of associated values displayed in the graph,
e.g. one year will be represented by one series and each series is
represented by a bar in the graph.
6. Legend
The list indicating the colour, line style, or filling pattern of the
graph corresponding to each series. The legend is displayed in the initial
state for chart types other than the stock chart.
7. Comment
8. Data label
The string that displays the name or value for the data on the
graph.
9. Text label
The string that can be displayed at any position within the graph.
10. Bar Graph
Ex 1.
The table showing production of wheat, rice and cereals for the years
1990 and 1999 is given below.

Year    Wheat    Rice    Cereals    Total
1990    50       100     150        300
1999    100      150     250        500

Bar Chart

[Figure: bar chart for the year 1990; series: Cereals, Rice, Wheat; vertical axis from 50 to 250]

Multiple or compound bar chart

[Figure: multiple (compound) bar chart for 1990 and 1999; series: Cereals, Rice, Wheat; vertical axis from 50 to 250]

## Year    Wheat                  Rice                    Cereals
1990    (50/300)x100 ≈ 17      (100/300)x100 ≈ 33      (150/300)x100 = 50
1999    (100/500)x100 = 20     (150/500)x100 = 30      (250/500)x100 = 50


2.5 Pie Chart Calculations

## Year    Wheat                  Rice                    Cereals
1990    (50/300)x360 = 60      (100/300)x360 = 120     (150/300)x360 = 180
1999    (100/500)x360 = 72     (150/500)x360 = 108     (250/500)x360 = 180
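
The angle of each sector is the share of the total multiplied by 360 degrees. A minimal sketch of this calculation for the production table is given below; the function name `pie_angles` is our own.

```python
production = {
    1990: {"Wheat": 50, "Rice": 100, "Cereals": 150},
    1999: {"Wheat": 100, "Rice": 150, "Cereals": 250},
}

def pie_angles(values):
    """Return the sector angle in degrees for each item."""
    total = sum(values.values())
    return {name: value * 360 / total for name, value in values.items()}

angles = {year: pie_angles(values) for year, values in production.items()}
for year, sectors in angles.items():
    print(year, sectors)
```

The angles of each year add up to 360 degrees, which is a quick check on the arithmetic.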

PIE CHART

[Figure: pie chart with sectors Wheat, Cereals and Rice]

A histogram of a frequency distribution is drawn as fol-
lows:
a) The class boundaries are marked on the X-axis starting and
finishing at convenient points on the axis, the class intervals are
thus marked on the X-axis and are taken as bases.
b) On each base, a rectangle is drawn whose height is equal to
the frequency of that class. If the class intervals are of equal
size or width, the areas of the rectangles are proportional to the
corresponding class frequencies. Here the vertical axis (or y-axis,
as it is commonly known) is the frequency axis.
c) Instead of class boundaries, class limits may be used if the
frequency distribution is given or constructed in terms of class
limits. But it is better to use class boundaries, especially in the case
of continuous variables. We draw below the histogram corresponding
to the frequency distribution given by the table in the
example to be solved.

## Consider the distribution of continuous data. If there are a large number
of observations and the class intervals are taken to be smaller, it may
still be possible to have a sizable frequency for most of the classes. Then
the frequency polygon will closely approximate a curve, which is called the
frequency curve. Such a curve is also known as a smoothed frequency
polygon.


It has been found that frequency curves of data found in nature
and industry generally take the characteristic shapes as indicated.

[Figure: characteristic shapes of frequency curves: Bell Shaped, Skewed to the right, Skewed to the left, J - Shaped, Reversed J - Shaped, U - Shaped]

## Consider the number of observations which are less than the
upper class boundary of a given class interval: this number is the sum of the
frequencies up to and including that class to which the boundary
corresponds. This sum is known as the cumulative frequency up to and
including that class interval.

## The cumulative frequency distribution is represented by joining
the points obtained by plotting the cumulative frequencies along the vertical
axis and the corresponding upper class boundaries along the x-axis.
The corresponding polygon is known as the cumulative frequency polygon
(less than) or ogive. By joining the points by a free hand curve we get
the cumulative frequency curve (“less than”). Similarly, we can construct
another cumulative frequency distribution (“more than” type) by considering
the sum of frequencies greater than the lower class boundaries of the
classes. For example, the total frequency greater than the lower class
boundary 158.5 of the class 159-160 is one (1), while the total frequency
greater than the lower class boundary 156.5 of the class 157-158 is 1 + 4
= 5, that of the class 155-156 is 1 + 4 + 6 = 11, and so on. Given below
is Table 3.7, the cumulative frequency distribution (“more than”) of the
same distribution.

## 2.7.1 Cumulative Frequency (less than) table of heights
of 50 students.

Class interval (in cms.)    Frequency    Cumulative Frequency (Less than)
144.5 – 146.5               2            2
146.5 – 148.5               5            7
148.5 – 150.5               8            15
150.5 – 152.5               15           30
152.5 – 154.5               9            39
154.5 – 156.5               6            45
156.5 – 158.5               4            49
158.5 – 160.5               1            50
Total                       50

## Cumulative frequency curve (more than and less than type) of

heights of 50 students.

[Figure: More Than Ogive (plot the more than CF against the LCL) and Less Than Ogive (plot the less than CF against the UCL); vertical axis: cumulative frequency, 10 to 50]

## Cumulative Frequency (more than) table of heights of 50 students.

Class interval (in cms.)    Frequency    Cumulative Frequency (More than)
145 – 146                   2            50
147 – 148                   5            48
149 – 150                   8            43
151 – 152                   15           35
153 – 154                   9            20
155 – 156                   6            11
157 – 158                   4            5
159 – 160                   1            1
Total                       50

The graph obtained by joining the points obtained by plotting the
cumulative frequencies (“more than”) along the vertical axis and the
corresponding lower class boundaries along the X-axis is known as the
cumulative frequency polygon (greater than) or ogive. By joining the points by a
free hand curve, one gets the cumulative frequency curve (“more than” type). These
two curves are shown in the figure above.
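
Both cumulative columns, and the points to be plotted for the two ogives, can be obtained from the frequency column alone. The listing below is a sketch for the heights of 50 students; the variable names are our own, and no plotting library is assumed.

```python
from itertools import accumulate

# Class boundaries (in cms.) and class frequencies for the 50 students.
boundaries = [144.5, 146.5, 148.5, 150.5, 152.5, 154.5, 156.5, 158.5, 160.5]
frequencies = [2, 5, 8, 15, 9, 6, 4, 1]

# "Less than": running total from the first class downward in the table.
less_than = list(accumulate(frequencies))
# "More than": running total taken from the last class backward.
more_than = list(accumulate(reversed(frequencies)))[::-1]

# Points for the ogives: less than CF against upper class boundaries,
# more than CF against lower class boundaries.
less_than_points = list(zip(boundaries[1:], less_than))
more_than_points = list(zip(boundaries[:-1], more_than))

print(less_than_points)
print(more_than_points)
```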

## An average is a representative value of a list. If all the numbers in the list
were the same, then this number should be used, e.g. if the numbers are all
the same, say 4, 4, 4, 4, then we can use the number 4 to represent this data set.
What if they are not the same? Which number should we select to
represent the data set? The average is the answer. The average should
not depend on the order of the numbers in the list, and it is not less than
the smallest number in the list, nor greater than the greatest number in the
list.

## The most common type of average is the arithmetic mean, often

simply called the mean. The arithmetic mean of two numbers, such as 2
and 8, is obtained by A = (2 + 8) / 2 = 5.

## Switching the order of 2 and 8 to read 8 and 2 does not change

the resulting value obtained for A. The mean 5 is not less than the mini-
mum 2 nor greater than the maximum 8. The mean of a list of integers is
not necessarily an integer.

bers we multiply them. Thus, the geometric mean of 2 and 8 is obtained
by G = square root of (2 x 8) = 4. And again it is seen that changing the
order of the members of the list to be averaged does not change the
result: In order to make sense of the requirement that the mean must be
at least as big as the smallest member of the list and no bigger than the
largest, the geometric mean is usually only applied to lists of positive
numbers, not to lists that can include negative numbers such as tem-
peratures.
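
The two averages of 2 and 8 discussed above can be verified numerically. This is a small sketch with our own function names; it also checks that the result does not depend on the order of the list.

```python
from math import prod

def arithmetic_mean(values):
    """Add the values and divide by how many there are."""
    return sum(values) / len(values)

def geometric_mean(values):
    """Multiply the values and take the n-th root (positive values only)."""
    return prod(values) ** (1 / len(values))

a = arithmetic_mean([2, 8])
g = geometric_mean([2, 8])
print(a, g)

# Changing the order of the list does not change either average.
print(arithmetic_mean([8, 2]) == a, geometric_mean([8, 2]) == g)
```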

## It should now be obvious that it would be easy to come up with

many other ways of combining the elements of a list in a manner that
does not change when the order of the list is changed. For each of them
one can define an average based on that method.

## The most frequently occurring number in a list of numbers is

called the mode. So the mode of the list (1, 2, 2, 3, 3, 3, 4) is 3. The
mode is not necessarily well defined. The list (1, 2, 2, 3, 3, 5) has the two
modes 2 and 3. The mode can be subsumed under the general method of
defining averages by understanding it as taking the list and setting each
member of the list equal to the most common value in the list if there is a
most common value. This list is then equated to the resulting list with all
values replaced by the same value. Since they are already all the same,
this does not require any change.

## Another average worth discussing is the median. Its method is

to order the list according to its magnitude and then repeatedly remove
the pair consisting of the highest and lowest value till either one or two
values are left. If two values are left replace them with their arithmetic
mean. This method takes the list 1, 7, 3, 13 and orders it to read 1, 3, 7,
13. Then the 1 and 13 are removed to obtain the list 3, 7. Since there are
two elements in this list replace them by their arithmetic mean (3 + 7)/2
= 5. Now do the same for the equal sized list consisting of all the same
value M: M, M, M, M. It is already ordered. We remove the two end
values to get M, M. We take their arithmetic mean to get M. Finally, set
this result equal to our previous result to get M = 5.

## All averages can be thought of as examples of this general method
for obtaining averages. A number of averages, including the ones
discussed above, that have been found to be useful in some circumstances
or other are listed below along with their formal solutions.

## A good measure of average should be simple to understand, easy
to calculate, based on each and every observation, stable under sampling
fluctuations, rigidly defined, and not unduly affected by extreme values.

## Median The observation value at the middle position when
data is arranged in the increasing order.

## Mode The most repeated value or the value having the highest
frequency in the data set.

Geometric mean $\left(x_1 \times x_2 \times x_3 \times \dots \times x_n\right)^{1/n}$

Harmonic mean $n \Big/ \sum \dfrac{1}{x} = n \Big/ \left(\dfrac{1}{x_1} + \dfrac{1}{x_2} + \dots + \dfrac{1}{x_n}\right)$

 Mean for grouped data

## Ex. In a school 35 boys and 85 girls appeared in a public school
examination. The mean marks of the boys were found to be 60%,
whereas the mean of the girls was 40%. Determine the average marks
percentage of the school.

Solution
Total score of 35 boys = 35 x 60 = 2100
Total score of 85 girls = 85 x 40 = 3400
Total score of 120 students = 2100 + 3400 = 5500 marks

## Average marks percentage of the school = 5500/120 = 45.8%
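
The same combined average can be computed as a weighted mean, with the group sizes as weights. The sketch below uses the figures of this example; the variable names are our own.

```python
# (number of students, mean marks in %) for boys and girls.
groups = [(35, 60), (85, 40)]

total_score = sum(n * m for n, m in groups)      # 2100 + 3400
total_students = sum(n for n, _ in groups)       # 35 + 85
combined_mean = total_score / total_students

print(total_score, total_students, round(combined_mean, 1))
```

Note that the combined mean lies nearer to 40 than to 60 because the girls form the larger group.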

The mean for grouped data is
$$\bar{X} = \frac{x_1 f_1 + x_2 f_2 + x_3 f_3 + \dots + x_k f_k}{n} = \frac{1}{n}\sum_{i=1}^{k} x_i f_i$$
OR
$$\bar{X} = \frac{\sum_{i=1}^{k} x_i f_i}{\sum_{i=1}^{k} f_i}$$
 Median

Median.

Example: -

## The median of the set 2, 3, 5, 6, 7 is 5, since the middle value is 5, and that
of the set -3, -1, 0, 1, 2, 3 is (0 + 1)/2 = 0.5.
The median is a value which divides the set of observations into
two equal parts such that 50% of the observations lie below the median
and 50% above the median.
The median is not affected by the actual values of the observations
but rather depends on their positions. Moreover, the median is not always a value from
the given set.


 Procedure to find median
(Frequency Distribution is given)
Step 1: Arrange the values of variables in ascending or descending
order of magnitude.
Step 2 : Find the cumulative frequency ( c.f.)

## Step 3 : Find N/2, where $N = \sum f_i$

Step 4 : Find the cumulative frequency (c.f.) first greater than N/2
and determine the corresponding value of the variable.
Step 5 : The value obtained in step 4 is the required median.

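Median:

Ex: 2, 3, 5, 6, 7
No. of observations in the given set = N = 5, which is an odd number.
$$\therefore \text{Median} = \text{value of the } \left(\frac{N+1}{2}\right)^{\text{th}} \text{ observation}$$
Here N = 5, so
$$\frac{N+1}{2} = \frac{5+1}{2} = 3$$
The 1st, 2nd, 3rd, 4th and 5th observations are 2, 3, 5, 6, 7.
$\therefore$ Median = 3rd observation = 5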

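## Solution : - No. of observations in the given set = N = 6, which is an even number.

$$\therefore \text{Median} = \frac{\text{value of the } \left(\frac{N}{2}\right)^{\text{th}} \text{ observation} + \text{value of the } \left(\frac{N}{2}+1\right)^{\text{th}} \text{ observation}}{2}$$

$$= \frac{\text{value of the } \left(\frac{6}{2}\right)^{\text{th}} \text{ observation} + \text{value of the } \left(\frac{6}{2}+1\right)^{\text{th}} \text{ observation}}{2}$$

$$= \frac{\text{value of the 3rd observation} + \text{value of the 4th observation}}{2}$$

$$= \frac{0 + 1}{2} = \frac{1}{2} = 0.5$$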
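
The two rules (odd N and even N) can be combined in one small function. The sketch below is our own illustration and is applied to the two sets used in this section, 2, 3, 5, 6, 7 and -3, -1, 0, 1, 2, 3.

```python
def median(values):
    """Median of a list: middle value for odd N, mean of the two middle values for even N."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd N: the ((N + 1) / 2)th observation (index mid, counting from 0).
        return ordered[mid]
    # Even N: average of the (N/2)th and (N/2 + 1)th observations.
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([2, 3, 5, 6, 7]))
print(median([-3, -1, 0, 1, 2, 3]))
```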

## Ex : Calculate the median for the following data

No. of students    6     4    16    7     8     2
Marks              20    9    25    50    40    80

## Solution: Arrange the marks in ascending order and prepare the following
frequency distribution table –
N = No. of observations = 43, which is odd.

Marks    No. of students    c.f.
9        4                  4
20       6                  10
25       16                 26
40       8                  34
50       7                  41
80       2                  43
Total    43

$$N = \sum_{i=1}^{n} f_i = 43$$

$$\frac{N+1}{2} = \frac{43+1}{2} = \frac{44}{2} = 22$$
Therefore, median = 22nd value.


$$\text{Median} = \text{value of the } \left(\frac{N+1}{2}\right)^{\text{th}} \text{ observation}$$
The above table shows that all items from the 11th to the 26th have the
value 25. Since the 22nd item falls in this interval, its value is 25.
Hence median = 25 marks.
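
The procedure for a frequency distribution can be sketched as follows. The code assumes an odd N, as in this example, so that the median is the ((N + 1)/2)th observation; the variable names are our own.

```python
from itertools import accumulate

# Marks -> number of students, as given in the example.
marks_to_students = {20: 6, 9: 4, 25: 16, 50: 7, 40: 8, 80: 2}

# Step 1: arrange the marks in ascending order.
ordered = sorted(marks_to_students.items())
# Step 2: cumulative frequencies.
cumulative = list(accumulate(f for _, f in ordered))
n = cumulative[-1]

# For odd N the median is the ((N + 1) / 2)th observation.
position = (n + 1) / 2
# Steps 4 and 5: first class whose cumulative frequency reaches that position.
median = next(mark for (mark, _), cf in zip(ordered, cumulative) if cf >= position)

print(n, position, median)
```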
Ex : 3,4,4,5,7,8,8,8,9,10

Solution:
In given data N = 10.

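$$\therefore \text{Median} = \frac{\text{value of the } \left(\frac{N}{2}\right)^{\text{th}} \text{ obs.} + \text{value of the } \left(\frac{N}{2}+1\right)^{\text{th}} \text{ obs.}}{2}$$

$$= \frac{\text{value of the } \left(\frac{10}{2}\right)^{\text{th}} \text{ obs.} + \text{value of the } \left(\frac{10}{2}+1\right)^{\text{th}} \text{ obs.}}{2}$$

$$= \frac{\text{value of the 5th obs.} + \text{value of the 6th obs.}}{2}$$

$$= \frac{7 + 8}{2} = 7.5$$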

 Mode

## The value with the highest frequency, or the most popular observation, is
called the mode.
For example, in the series 6,5,3,4,7,8,5,9,5,4 we notice that 5 occurs
most frequently. Hence 5 is the mode.

## Ex. : Calculate Mean, Mode, Median for the following data.

Marks 0 1 2 3 4 5 6 7 8

Number of boys 7 10 16 17 26 31 11 2 1

Solution :

xi       fi      fi xi    c.f.
0        7       0        7
1        10      10       17
2        16      32       33
3        17      51       50
4        26      104      76
5        31      155      107
6        11      66       118
7        2       14       120
8        1       8        121
Total    121     440

Mean
$$\bar{X} = \frac{\sum f_i x_i}{\sum f_i} = \frac{440}{121} = 3.64$$

Median
$$M = \left(\frac{N+1}{2}\right)^{\text{th}} \text{ observation} = \left(\frac{121+1}{2}\right)^{\text{th}} \text{ observation} = \left(\frac{122}{2}\right)^{\text{th}} \text{ observation} = 61^{\text{st}} \text{ observation}$$
Corresponding c.f. = 76
Corresponding xi = 4
$\therefore$ Median = 4

Mode
Highest frequency of the given data = 31
Value with the highest frequency = 5
$\therefore$ Mode = 5

e.g.
(I) 1,2,2,3,3,3,4
The numbers appear in this data as follows:
1 once, 2 twice, 3 thrice, 4 once
$\therefore$ 3 is the mode of the given data.

(II) 1,2,2,3,3,5
The data points appear in this data as follows:
1 once, 2 twice, 3 twice, 5 once
$\therefore$ both 2 and 3 are modes.
2.9 Partition Values

These are the values that divide the total observations into a number
of equal parts when the data is arranged in increasing order.

2.9.1 Quartiles

Quartiles divide the total observations into 4 equal parts. There
are 3 quartiles in total. The quartile Qi has (25 × i)% of the values below it. e.g. if Q1 of a
data set is 40, it means that 25% (25 × 1 = 25) of the observations of that
data set are below 40 and 75% (100 - 25 = 75) of the observations have a value
more than 40.

$$Q_1 = L + \left(\frac{\frac{N}{4} - C.F.}{F}\right) \times h$$

L = Lower limit of the quartile class
C.F. = Cumulative frequency (c.f.) of the previous class
F = Frequency of the quartile class
h = Width of the quartile class

$$Q_3 = L + \left(\frac{\frac{3N}{4} - C.F.}{F}\right) \times h$$

Q1 is called the lower quartile or first quartile.
Q2 is called the middle quartile or second quartile or median.
Q3 is called the upper quartile or third quartile.

2.9.2 Deciles

Deciles divide the total observations into 10 equal parts. There
are 9 deciles in total. The decile Di has (10 × i)% of the values below it. e.g. if D8 of a
data set is 70, it means that 80% (10 × 8 = 80) of the observations of that
data set are below 70 and 20% (100 - 80 = 20) of the observations have a value
more than 70.

$$D_i = L + \left(\frac{\frac{N \times i}{10} - C.F.}{F}\right) \times h$$

For example;

$$D_1 = L + \left(\frac{\frac{N \times 1}{10} - C.F.}{F}\right) \times h$$

$$D_2 = L + \left(\frac{\frac{N \times 2}{10} - C.F.}{F}\right) \times h$$

L = Lower limit of the decile class
C.F. = Cumulative frequency (c.f.) of the previous class
F = Frequency of the decile class
h = Width of the decile class

2.9.3 Percentiles
Percentiles are the values of the variate that divide the total frequency
into 100 equal parts. There are 99 percentiles in total. The percentile Pi
has i% of the values below it. e.g. if your score in an examination is at the 86th
percentile and your actual marks are 70, it means that 86%
of the candidates have scored less than 70 marks, i.e. 86% of the candidates have
scored less marks than you and 14% (100 - 86 = 14) of the candidates have
scored more marks than you.

The formula is

$$P_i = L + \left(\frac{\frac{N \times i}{100} - C.F.}{F}\right) \times h$$

For example;

$$P_1 = L + \left(\frac{\frac{N \times 1}{100} - C.F.}{F}\right) \times h$$

$$P_2 = L + \left(\frac{\frac{N \times 2}{100} - C.F.}{F}\right) \times h$$

L = Lower limit of the percentile class
C.F. = Cumulative frequency (c.f.) of the previous class
F = Frequency of the percentile class
h = Width of the percentile class

Ex : The distribution of fortnightly wages of 280 employees of a company
is as given below:

Fortnightly Wages in Rs.    Frequency
Less than 200               12
200-400                     16
400-600                     38
600-800                     78
800-1000                    80
1000-1200                   35
1200-1400                   14
Above 1400                  7
Total                       280

Find the first quartile, the median and the third quartile. Find D4 and P66.
Solution:
The cumulative frequency (c.f.) is as below:

Wages (Rs.)       Frequency    C.F.
Less than 200     12           12
200-400           16           28
400-600           38           66
600-800           78           144
800-1000          80           224
1000-1200         35           259
1200-1400         14           273
Above 1400        7            280
Total             280

Here class 200-400 means wages from Rs. 200 (200 is the lower
class limit and is included in the class) to less than Rs. 400 (400 is the upper
class limit and is NOT included in the class).
Q1 (the first quartile) corresponds to the 280/4 = 70th observation, which
lies in the interval 600-800, so the lower class boundary is L = 600.
This interval contains 78 observations and the C.F. of the earlier class is
66. Class width h = 200.

$$Q_1 = L + \left(\frac{\frac{N \times 1}{4} - C.F.}{F}\right) \times h$$

$$Q_1 = 600 + \left(\frac{\frac{280 \times 1}{4} - 66}{78}\right) \times 200 = 600 + \left(\frac{70 - 66}{78}\right) \times 200$$

= 600 + 10.256
= 610.256

The median, which is the second quartile (Q2), corresponds to
2N/4 = 140, which also lies in the same class (600-800).

$$Q_2 = 600 + \left(\frac{\frac{280 \times 2}{4} - 66}{78}\right) \times 200$$

= 600 + 189.74
= 789.74

Q3 (the third quartile) corresponds to the (3 × N)/4 = (3 × 280)/4 = 210th
observation, which lies in the interval 800-1000.

Therefore, L = 800, F = 80, C.F. = 144, h = 200.

$$Q_3 = L + \left(\frac{\frac{N \times 3}{4} - C.F.}{F}\right) \times h$$

$$Q_3 = 800 + \left(\frac{\frac{280 \times 3}{4} - 144}{80}\right) \times 200$$

= 800 + 165
= 965

D4
The 4th decile corresponds to the (N × i)/10 = (280 × 4)/10 = 112th observation,
which lies in the interval 600-800.
Therefore, L = 600, F = 78, C.F. = 66, h = 200

$$D_4 = 600 + \left(\frac{\frac{4 \times 280}{10} - 66}{78}\right) \times 200$$

= 600 + 117.95
= 717.95
P66

The 66th percentile corresponds to position (66 × 280)/100 = 184.8,
which lies in the interval 800-1000.

Therefore, L = 800, F = 80, C.F. = 144, h = 200

$$P_i = L + \left(\frac{\frac{N \times i}{100} - C.F.}{F}\right) \times h$$

$$P_{66} = 800 + \left(\frac{\frac{66 \times 280}{100} - 144}{80}\right) \times 200 = 800 + \left(\frac{184.8 - 144}{80}\right) \times 200$$

= 800 + 102
= 902
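
Quartiles, deciles and percentiles all use the same interpolation formula, so one Python function can cover them. The sketch below is illustrative only: the name `partition_value` and the tuple layout are our own, and the open-ended classes "Less than 200" and "Above 1400" are assumed to be 0-200 and 1400-1600 (this assumption does not affect the values computed here).

```python
def partition_value(classes, k, parts):
    """k-th partition value: parts=4 (quartiles), 10 (deciles), 100 (percentiles).

    classes: list of (lower_limit, width, frequency) in ascending order.
    """
    n = sum(freq for _, _, freq in classes)
    position = k * n / parts                 # e.g. N * i / 4 for quartile Qi
    cf_before = 0                            # C.F. of the previous class
    for lower, width, freq in classes:
        if cf_before + freq >= position:     # this is the partition class
            return lower + (position - cf_before) / freq * width
        cf_before += freq
    raise ValueError("k must not exceed parts")


# (lower limit, width, frequency); first and last class bounds are assumed.
wages = [(0, 200, 12), (200, 200, 16), (400, 200, 38), (600, 200, 78),
         (800, 200, 80), (1000, 200, 35), (1200, 200, 14), (1400, 200, 7)]

for label, k, parts in [("Q1", 1, 4), ("Q2", 2, 4), ("Q3", 3, 4),
                        ("D4", 4, 10), ("P66", 66, 100)]:
    print(label, round(partition_value(wages, k, parts), 2))
```

The printed values agree with the hand calculation above up to rounding (610.26, 789.74, 965.0, 717.95, 902.0).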

Ex : For what value of x will 8 and x have the same mean (average)
as 27 and 5 ?

$$\frac{27 + 5}{2} = 16$$

Therefore

$$\frac{x + 8}{2} = 16$$

32 = x + 8
24 = x

Ex : On his first 5 biology tests, X received the following scores: 72,
86, 92, 63 and 77. What test score must X earn on his sixth test
so that his average (mean score) for all six tests will be 80?

Solution :

$$\frac{72 + 86 + 92 + 63 + 77 + x}{6} = 80$$

(80)(6) = 390 + x
480 = 390 + x
90 = x
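
The same rearrangement (target mean × number of values, minus the known total) can be sketched in Python. The function name `required_value` is hypothetical.

```python
def required_value(known_values, target_mean):
    """Value that must be added so that the mean of all values equals target_mean."""
    total_count = len(known_values) + 1          # known values plus the unknown one
    return target_mean * total_count - sum(known_values)


print(required_value([72, 86, 92, 63, 77], 80))  # 90
print(required_value([8], 16))                   # 24
```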

Ex : The mean (average) weight of three dogs is 38 kg. One of the
dogs, X, weighs 46 kg. The other two dogs, Y and Z, have the
same weight. Find Z’s weight.
Let x = Y’s weight
Therefore Z’s weight = x (they weigh the same so they are both
“x”)
Average: sum of the data divided by the number of data.

$$\frac{x + x + 46}{3 \text{ (dogs)}} = 38$$

(38)(3) = 2x + 46
114 = 2x + 46
68 = 2x
x = 34

Z weighs 34 kg.

## Dispersion means the spread or variability of data.

Measures of Dispersion:

items and their common mean;
i) the mean deviation
ii) the standard deviation
i) the 10 to 90 percentile range
ii) the quartile deviation

The Range: The difference between the smallest and largest
values in a set or distribution.
Ex : The daily number of books sold by two separate bookstores
over twelve days were:
Bookstore 1 : 3, 5, 1, 4, 5, 3, 6, 8, 6, 2, 3, 7
Bookstore 2 : 2, 3, 2, 1, 4, 3, 2, 2, 1, 3, 4, 1

The range of values for Bookstore 1 is 8 - 1 = 7, and for
Bookstore 2 it is 4 - 1 = 3. Thus, daily sales are more variable for Bookstore 1.

The mean deviation is a measure of dispersion that gives the average absolute
difference between each item and the mean. Absolute means ignoring
the negative sign, i.e. read | x | as "absolute x": | -2 | = 2 and | 2 | = 2.

Mean Deviation for Ungrouped Data

$$M.D. = \frac{\sum_{i=1}^{n} |x_i - \bar{x}|}{n}$$

Mean Deviation for Grouped Data

$$M.D. = \frac{\sum_{i=1}^{n} f_i |x_i - \bar{x}|}{\sum_{i=1}^{n} f_i}$$

Even if nothing is mentioned, the mean deviation (M.D.) is always taken about the mean.

Ex : Find the range and calculate the mean deviation of 84, 92,
73, 67, 88, 74, 91, 74

Range = 92 - 67
= 25

$$\text{Mean} = \frac{84 + 92 + 73 + 67 + 88 + 74 + 91 + 74}{8} = \frac{643}{8} = 80.375$$

$$\text{Mean Deviation} = \frac{3.625 + 11.625 + 7.375 + 13.375 + 7.625 + 6.375 + 10.625 + 6.375}{8} = \frac{67}{8} = 8.375$$

In other words, each value in the set is, on average, 8.375 units
away from the common mean.
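
The range and mean deviation of this example can be reproduced with a few lines of Python. This is a minimal sketch; the function name `mean_deviation` is our own.

```python
def mean_deviation(data):
    """Average absolute difference between each item and the mean."""
    mean = sum(data) / len(data)
    return sum(abs(x - mean) for x in data) / len(data)


scores = [84, 92, 73, 67, 88, 74, 91, 74]
print(max(scores) - min(scores))   # range: 25
print(mean_deviation(scores))      # 8.375
```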


Ex : Calculate the mean deviation from the following distribution:

Number of orders    10-14   15-19   20-24   25-29   30-34   35-39
Number of weeks     3       17      15      20      9       4

Number of orders   Number of weeks (f)   Mid Point (x)   fx     |x - x̄|   f |x - x̄|
10-14              3                     12              36     11.99     35.97
15-19              17                    17              289    6.99      118.83
20-24              15                    22              330    1.99      29.85
25-29              20                    27              540    3.01      60.2
30-34              9                     32              288    8.01      72.09
35-39              4                     37              148    13.01     52.04
Total              68                                    1631             368.98

$$\text{Mean number of orders: } \bar{x} = \frac{1631}{68} = 23.99$$

$$\text{Mean Deviation: } M.D.(\bar{x}) = \frac{\sum f_i |x_i - \bar{x}|}{\sum f_i} = \frac{368.98}{68} = 5.43$$
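
For grouped data the same calculation weights each class midpoint by its frequency. A minimal Python sketch follows; the function name `grouped_mean_deviation` is our own, and the midpoints are those used in the table above.

```python
def grouped_mean_deviation(midpoints, freqs):
    """Mean deviation about the mean for a frequency distribution."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(midpoints, freqs)) / n
    return sum(f * abs(x - mean) for x, f in zip(midpoints, freqs)) / n


midpoints = [12, 17, 22, 27, 32, 37]   # class midpoints of 10-14, 15-19, ...
weeks = [3, 17, 15, 20, 9, 4]          # number of weeks (frequencies)
print(round(grouped_mean_deviation(midpoints, weeks), 2))  # 5.43
```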

##  The Standard deviation is a measure of the average deviation

from the mean value.
 It is the most common measure of dispersion, (Remember: dis-
 It is used as a measure for comparison only when the units in
the distribution are the same and the respective means are com-
parable.

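Standard Deviation for :

1) Ungrouped Data

$$\sigma = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n}}, \quad \text{where } \bar{x} = \frac{\sum x_i}{n}$$

$$\sigma = \sqrt{\frac{\sum x_i^2}{n} - \left(\frac{\sum x_i}{n}\right)^2}$$

2) A frequency distribution:

$$\sigma = \sqrt{\frac{\sum f_i (x_i - \bar{x})^2}{\sum f_i}}$$

$$\sigma = \sqrt{\frac{\sum f_i x_i^2}{\sum f_i} - \left(\frac{\sum f_i x_i}{\sum f_i}\right)^2}$$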

Find the standard deviation of 84, 92, 73, 67, 88, 74, 91, 74

Solution:

xi       xi²
84       7056
92       8464
73       5329
67       4489
88       7744
74       5476
91       8281
74       5476
Total    643      52315

$$\bar{x} = \frac{\sum x_i}{n} = \frac{643}{8} = 80.375 \approx 80.38$$

$$S.D. = \sigma = \sqrt{\frac{\sum x_i^2}{n} - \bar{x}^2} = \sqrt{\frac{52315}{8} - (80.38)^2}$$

$$= \sqrt{6539.375 - 6460.94} = \sqrt{78.435} = 8.856$$
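
Both ungrouped formulas can be checked against each other in Python. The sketch below uses our own function names. Note that it keeps the mean unrounded (80.375), so it gives about 8.90; the hand calculation rounds the mean to 80.38 before squaring, which is why it arrives at the slightly smaller 8.856.

```python
import math


def std_dev(data):
    """Standard deviation from squared deviations about the mean."""
    n = len(data)
    mean = sum(data) / n
    return math.sqrt(sum((x - mean) ** 2 for x in data) / n)


def std_dev_shortcut(data):
    """Standard deviation from the sum of squares and the sum of values."""
    n = len(data)
    return math.sqrt(sum(x * x for x in data) / n - (sum(data) / n) ** 2)


scores = [84, 92, 73, 67, 88, 74, 91, 74]
print(sum(x * x for x in scores))          # 52315
print(round(std_dev(scores), 2))           # 8.9
print(round(std_dev_shortcut(scores), 2))  # 8.9
```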

Ex : Calculate the standard deviation of the following distribution:

Number of orders    10-14   15-19   20-24   25-29   30-34   35-39
Number of weeks     3       17      15      20      9       4

No. of orders   No. of weeks (fi)   Midpoint (xi)   fi xi   fi xi²
10 - 14         3                   12              36      432
15 - 19         17                  17              289     4913
20 - 24         15                  22              330     7260
25 - 29         20                  27              540     14580
30 - 34         9                   32              288     9216
35 - 39         4                   37              148     5476
Total           68                                  1631    41877

$$S.D. = \sigma = \sqrt{\frac{\sum f_i x_i^2}{\sum f_i} - (\bar{x})^2} = \sqrt{\frac{41877}{68} - (23.99)^2}$$

$$= \sqrt{615.838 - 575.52} = \sqrt{40.318} = 6.35$$
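
The frequency-distribution formula can be sketched in Python as follows. The function name `grouped_std_dev` is our own. Because the code keeps the mean unrounded (1631/68 rather than 23.99), it returns about 6.37 instead of the hand-rounded 6.35.

```python
import math


def grouped_std_dev(midpoints, freqs):
    """Standard deviation of a frequency distribution using class midpoints."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(midpoints, freqs)) / n
    mean_of_squares = sum(f * x * x for x, f in zip(midpoints, freqs)) / n
    return math.sqrt(mean_of_squares - mean ** 2)


midpoints = [12, 17, 22, 27, 32, 37]
weeks = [3, 17, 15, 20, 9, 4]
print(sum(f * x * x for x, f in zip(midpoints, weeks)))  # 41877
print(round(grouped_std_dev(midpoints, weeks), 2))       # 6.37
```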

2.11 The Coefficient of Variation

- When a comparison of two distributions and their means is
made, it is necessary to do so with regard to their variability.
- While the standard deviation is the important measure of spread,
it cannot be used as the sole basis for comparing two distributions.
- This is because it is an absolute measure of dispersion that
measures variation in the same units as the original data.
(Remember that absolute values ignore the negative signs).
- For example, if we have a standard deviation of 10 and a mean
of 5, the values vary by an amount twice as large as the mean itself. If,
on the other hand, we have a standard deviation of 10 and a
mean of 5,000, the variation relative to the mean is insignificant. Therefore
we cannot judge the dispersion of a set of data until we know
how the standard deviation compares with the mean.
- A relative measure of dispersion, which compares the standard
deviation to the mean, is the coefficient of variation, which is
found by dividing the standard deviation by the mean.

Algebraically, this is:

$$\text{Coefficient of variation} = \frac{SD}{\text{Mean}} \times 100 = \frac{\sigma}{\bar{x}} \times 100$$
Ex : Given the following data:
A: x̄ = 120, σ = 55
B: x̄ = 90, σ = 50

Solution:

While σA > σB, the average value of A (120) is higher than that
of B (90). This means that deviations from the mean of A, and
thus its standard deviation, will tend to be higher.

A: Coefficient of variation = (55 / 120) × 100 = 45.8%
B: Coefficient of variation = (50 / 90) × 100 = 55.6%

B has the higher relative variability in weekly wages.
Although the standard deviation for A is higher in absolute terms,
the dispersion for B is higher in relative terms.
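
The comparison of A and B can be reproduced with a one-line function. This is a minimal Python sketch; the name `coefficient_of_variation` is our own.

```python
def coefficient_of_variation(sd, mean):
    """Relative dispersion: standard deviation as a percentage of the mean."""
    return sd / mean * 100


cv_a = coefficient_of_variation(55, 120)
cv_b = coefficient_of_variation(50, 90)
print(round(cv_a, 1), round(cv_b, 1))  # 45.8 55.6
print("B" if cv_b > cv_a else "A", "has the higher relative variability")
```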

2.12 Skewness

##  Skewness describes the extent of non-symmetry of a distribu-

tion.
 It can be positive (for a distribution which is skewed to the right),
negative (when a distribution is skewed to the left), or zero (for a
symmetric distribution).
 If a distribution is skewed, it means that values of the distribu-
tion are concentrated at either the low end or the high end of the
measuring scale on the horizontal axis. For example, the two
curves below are skewed distributions:

A distribution in which the values of mean, median and mode
coincide (i.e. mean = median = mode) is known as a symmetrical
distribution. Conversely, when the values of mean, median and mode are not
equal, the distribution is known as an asymmetrical or skewed distribution.
In a moderately skewed or asymmetrical distribution a very important
relationship exists among these three measures of central tendency. In
such distributions the distance between the mean and median is about
one-third of the distance between the mean and mode, as will be clear
from the diagrams 1 and 2. Karl Pearson expressed this relationship as:

Mode = 3 median - 2 mean ; mode = mean - 3 (mean - median)

$$\text{and Median} = \text{mode} + \frac{2}{3}\,[\text{mean} - \text{mode}]$$

Skewness is a measure of the asymmetry of a frequency distribution,
and the skewness coefficient is included as one of the statistics.
A right, or positive, skewed forecast has a greater density of values occurring,
and the mode, around the lower end of the range. A left, or negative,
skewed forecast displays the opposite trend. The skewness of a frequency
distribution can be an important consideration. For example, if
your forecast is Net profit, you would prefer a situation that led to a
positively skewed distribution of profit to one that is negatively skewed (with
all else being equal).

Below are some rough guidelines for interpreting skewness coefficient
values:
- Greater than 1 or less than -1 indicates a highly skewed distribution;
- Between 0.5 and 1 or between -1 and -0.5 is moderately skewed; and
- Between -0.5 and 0.5 indicates that the distribution is fairly symmetric.

$$\text{Pearson’s skew } (SK_P) = \frac{\text{Mean} - \text{Mode}}{\sigma}$$

Alternatively, the skewness coefficient can be Bowley’s coefficient
or Kelly’s coefficient. Bowley’s coefficient is always between -1
and +1, both included.

$$\text{Bowley’s skewness coefficient} = \frac{Q_3 + Q_1 - 2\,\text{Median}}{Q_3 - Q_1}$$

where Q3 is the third quartile and Q1 is the first quartile.

1) Distribution has positive skewness
SKP > 0
mode < mean
2) Distribution has negative skewness
SKP < 0
mode > mean
3) Distribution is symmetrical, skewness is zero
SKP = 0
Mode = mean
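
Both coefficients are simple ratios, as the Python sketch below shows. The function names are our own. The Pearson inputs (mean 50, mode 44, standard deviation 12) are made-up illustrative numbers, while the Bowley inputs reuse the quartiles found for the fortnightly wages example in section 2.9.

```python
def pearson_skew(mean, mode, sd):
    """Pearson's skewness coefficient: (mean - mode) / standard deviation."""
    return (mean - mode) / sd


def bowley_skew(q1, median, q3):
    """Bowley's skewness coefficient based on the quartiles."""
    return (q3 + q1 - 2 * median) / (q3 - q1)


print(pearson_skew(50, 44, 12))                       # 0.5 (hypothetical data)
print(round(bowley_skew(610.256, 789.74, 965), 3))    # -0.012
```

A Bowley value this close to zero suggests the wage distribution is fairly symmetric in its middle half.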


 The Empirical Rule
- The standard deviation can be used to convey information about
variability in a collection of data.
- To illustrate this, we look at the case of a normal population,
which means that the data values have a bell-shaped histogram.
- It can be shown for such populations that about 68% of the data
lie within one standard deviation of the mean, about 95% within
two standard deviations of the mean, and about 99% (more precisely
99.7%) within three standard deviations of the mean. This is shown
in the diagram below:

[Figure: bell-shaped curve with the horizontal axis marked μ - 3σ, μ - 2σ, μ - 1σ, μ, μ + 1σ, μ + 2σ, μ + 3σ. Areas: 34% on each side between μ and μ ± 1σ, 13.5% on each side between μ ± 1σ and μ ± 2σ, and 2% on each side between μ ± 2σ and μ ± 3σ.]

It is sometimes important to know whether a sample came from a normal
population. For this to be the case, it is necessary that the empirical rule
be satisfied.

## Ex : The following values represent the scores of 40 students in

an exam.

46 58 65 70 76 49 59 66 71 78

50 59 66 71 79 53 60 66 72 80

54 62 66 73 82 55 63 68 73 83

55 64 68 73 84 57 65 69 74 88

Given that the mean and standard deviation of this data are 67
and 10 respectively, does this data satisfy the empirical rule?

Solution:

Examining the data, we see that 26 of the numbers lie in the
range 57-77, i.e. within one standard deviation of the mean. These 26
numbers represent 26/40 or 65% of the data, which is very close to the 68%
that lie within one standard deviation under the empirical rule. Further
calculations are shown below:

Within                 Number of Values    Percent    Empirical %
1 S.D. (57-77)         26                  65         68
2 S.D.’s (47-87)       38                  95         95
3 S.D.’s (37-97)       40                  100        99

As the percentages for this sample are very close to the empirical
rule, it is reasonable to conclude that this sample comes from a normal
population.
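
The counting in the table can be verified with a short Python sketch. The function name `share_within` is our own; the mean 67 and standard deviation 10 are taken as given in the example rather than recomputed.

```python
scores = [46, 58, 65, 70, 76, 49, 59, 66, 71, 78,
          50, 59, 66, 71, 79, 53, 60, 66, 72, 80,
          54, 62, 66, 73, 82, 55, 63, 68, 73, 83,
          55, 64, 68, 73, 84, 57, 65, 69, 74, 88]


def share_within(data, mean, sd, k):
    """Count and percentage of values within k standard deviations of the mean."""
    inside = [x for x in data if mean - k * sd <= x <= mean + k * sd]
    return len(inside), 100 * len(inside) / len(data)


for k in (1, 2, 3):
    print(k, share_within(scores, 67, 10, k))
# 1 (26, 65.0)
# 2 (38, 95.0)
# 3 (40, 100.0)
```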

The standard score is a measure of the position of a data value
relative to the other data values in the collection.

Formula:

$$Z = \frac{x - \text{mean}}{\text{standard deviation}}$$

where Z = standard score
x = any value in a data set

The standard score (Z) of a data value (x) is the number of standard
deviations that the data value is above or below the mean.

Ex : The scores on a statistics exam had a mean of 50% and a
standard deviation of 5%.

The scores on an economics exam had a mean of 40% and a
standard deviation of 25%.

Jill received a mark of 70 in statistics and Jack received a mark
of 90 in the economics exam.

Solution :

Subject    Statistics    Economics
Mean       50%           40%
S.D.       5%            25%

$$\text{Coefficient of variation} = \frac{SD}{\text{Mean}} \times 100$$

$$C.V.\ (\text{Stats}) = \frac{5}{50} \times 100 = 10$$

$$C.V.\ (\text{Eco}) = \frac{25}{40} \times 100 = 62.5$$

C.V. (Stats) < C.V. (Eco)
Therefore, the result of Stats is better (more consistent) than that of Eco.
Given:
Jill (Stats) = 70
Jack (Eco) = 90.
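
The standard-score formula from this section can also be applied directly to the two marks. The Python sketch below is an illustration only; the function name `standard_score` is our own, and the printed solution above compares coefficients of variation rather than Z values.

```python
def standard_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


z_jill = standard_score(70, 50, 5)    # statistics exam
z_jack = standard_score(90, 40, 25)   # economics exam
print(z_jill, z_jack)                 # 4.0 2.0
```

On this scale Jill's mark is 4 standard deviations above her class mean, while Jack's is 2 standard deviations above his.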

While the range is a simple concept and is easy to calculate, it
only takes two values into account, and is obviously affected by extreme
values (outliers). Remember the Handout on Dispersion and skewness,
page 1.

An attempt to remove this deficiency (extreme values) is to look
at the range in the middle part of the data, when arranged in ascending or
descending order.

To do this, we begin by defining (see figure 1, page 151):

- The first quartile (denoted Q1) as the value with 25 percent of the
data below it.
For example, if a distribution has 8 values, Q1 is the value with 2
numbers less than it.
- The third quartile (denoted Q3) as the value which has 75 percent
of the data below it.
For example, if a distribution has 8 values, Q3 is the value with 6
numbers less than it.
- The range of the middle 50 percent of the data is found by
subtracting Q1 from Q3; this is called the inter-quartile range.

Sometimes, the term semi-interquartile range (also called quartile
deviation) is used instead of the interquartile range; this is just the
inter-quartile range divided by two.

$$\text{Semi-interquartile range} = \text{Quartile Deviation} = \frac{Q_3 - Q_1}{2}$$

The inter-quartile range (or the semi-interquartile range) for a frequency
distribution can be found using one of two methods.

1. Graphical Approach :

This involves drawing a less-than ogive from a less-than
cumulative frequency distribution, drawing horizontal lines
from 25% and 75% on the cumulative percentage scale to the
ogive, and reading Q1 and Q3 vertically downwards on the
horizontal axis.

2. Formula approach:

This uses a known formula to generate the values of Q1 and Q3
respectively.
Note that these methods should both give approximately the same answer.

Ex : The following table gives the current wage structure in a
company.
(Note : A grouped frequency distribution):

Weekly Wages         Number of Employees
Under 200            16
200 to under 225     153
225 to under 250     101
250 to under 275     92
275 to under 300     68
300 and over         50

approach?

Solution:

To solve this problem, we must first convert the data into a
cumulative frequency distribution and then get the percentage cumulative
frequency:

Weekly wages under    Cumulative frequency
200                   16
225                   169
250                   270
275                   362
300                   430
325                   480

We then draw an ogive of this distribution.

From this graph, it can be seen that:
Q3: the cumulative frequency is ¾ of 480, i.e. 360, which gives Q3 = £ 274
Q1: the cumulative frequency is ¼ of 480, i.e. 120, which gives Q1 = £ 217
=> Inter-quartile range = £ 274 - £ 217 = £ 57

2. By formula

To find the inter-quartile range of grouped data, we first find the
classes which contain Q1 and Q3 respectively. This means that we
multiply the total frequency by ¼ and ¾ respectively and find the classes
that contain these values.
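
One way to carry the formula approach through, assuming the grouped-data quartile formula of section 2.9.1 (Q = L + ((position - C.F.) / F) × h), is sketched below in Python. The function name `grouped_quartile` is our own, and the open-ended classes are assumed to run from 175 to 200 and from 300 to 325; these assumed bounds do not affect Q1 or Q3 here.

```python
def grouped_quartile(classes, k):
    """k-th quartile of grouped data; classes are (lower_limit, width, frequency)."""
    n = sum(freq for _, _, freq in classes)
    position = k * n / 4                  # 1/4 or 3/4 of the total frequency
    cf_before = 0
    for lower, width, freq in classes:
        if cf_before + freq >= position:  # class that contains the quartile
            return lower + (position - cf_before) / freq * width
        cf_before += freq
    raise ValueError("k must be between 1 and 4")


# Weekly wage classes; the first and last bounds are assumptions.
structure = [(175, 25, 16), (200, 25, 153), (225, 25, 101),
             (250, 25, 92), (275, 25, 68), (300, 25, 50)]

q1 = grouped_quartile(structure, 1)
q3 = grouped_quartile(structure, 3)
print(round(q1, 2), round(q3, 2), round(q3 - q1, 2))
```

The results (about 217 and 274, giving an inter-quartile range of about 57) agree with the values read from the ogive.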

2.14 Extreme values

The terms outlier and extreme value are often used interchangeably.
Both refer to a data value that is atypical of the data set, i.e. a value
which differs markedly from most of the numbers in the set.

For example, suppose that the number of championship matches
played by a team in the last five years is as follows:

2 1 10 1 1

The average (mean) of these numbers is 3, which is heavily
influenced by the extreme value (also called an outlier) 10.

Discarding the outlier, we obtain a modified mean of 1.25, which
is perhaps more meaningful for comparisons or for setting a norm.

Alternatively, we could decide that the median is a better
measure of average for data sets with outliers.

Data sets should always be examined for outliers, as the
reasons for such values can vary: they may be due to weather conditions or
due to a recording error.
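
The effect of the outlier on the mean, and the robustness of the median, can be seen in a few lines of Python using the standard `statistics` module. The variable names are our own.

```python
from statistics import mean, median

matches = [2, 1, 10, 1, 1]
without_outlier = [m for m in matches if m != 10]   # discard the extreme value

print(mean(matches))           # 3
print(mean(without_outlier))   # 1.25
print(median(matches))         # 1
```

The median stays at 1 whether or not the value 10 is included, which is why it is often preferred for data sets with outliers.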

2.15 Summary

Measures of central tendency are Mean, Mode and Median.
Mean is nothing but the average of the given observations. Mean is calculated
for two types of data:
- Ungrouped
- Grouped

For ungrouped data:

$$\bar{x} = \frac{\sum x_i}{n}$$

where Σ xi = x1 + x2 + x3 + . . . + xn
n = No. of observations given in the data.
For grouped data, frequencies are given as fi:

$$\text{Mean} = \bar{x} = \frac{\sum f_i x_i}{N}, \quad \text{where } N = \sum f_i$$

Likewise, for Median and Mode there are two different formulae for
grouped and ungrouped data.


1) Title
2) Plot area
3) Smoothed frequency polygon
4) More than and less than
5) 3
6) 5
7) 4
8) 10
9) Percentiles
10) Dispersion

## 1) Graphical presentation of Data.

2) Central Tendency.
3) Measures of Dispersion.
4) Types of Graphs.
5) Quartiles.


  


CHAPTER 3

## CORRELATION AND REGRESSION

3.0 Objectives
3.1 Introduction
3.2 Scatter diagram
3.3 Correlation & Covariance
3.4 Karl Pearson’s correlation coefficient
3.5 Spearman’s Rank Correlation
3.6 Coefficient of concurrent Deviation
3.7 Standard Error & Probable Error
3.8 Coefficient of Determination
3.9 Regression
3.9.1 Least Square Method
3.9.2 Properties of Regression coefficient
3.10 Residual Values
3.11 Standard Error Estimate
3.12 Limitations
3.13 Homoscedacity
3.14 Summary
3.16 Questions for Self - Study

3.0 Objectives

After studying the concepts of correlation and regression, students
can explain the following:
- Relation between two variables.
- Different errors in calculations.
- Formulae to calculate the correlation coefficient, covariance, etc.
- Use of regression.
- Limitations of regression, and can also solve examples with
given numerical data.

3.1 Introduction

Correlation is the relation between changes in the values of two
variables. If we increase the value of one of the variables by some amount,
what effect will it have on the value of the other variable? The answer is provided
by correlation analysis. Let us take an example. We take a group of
students and record their weights and heights. If an increase in weight also
corresponds to an increase in height in general, then we can say that the
height and weight of the group of students are correlated.

Correlation is an analysis of the co-variation, i.e. simultaneous
changes, between the values of two or more variables.

The correlation expresses the relationship between the groups,
but not between individual items.

The relationship between two variables cannot be expressed as an
exact mathematical relationship.

Correlation analysis is a statistical procedure by which we can
determine the degree of association or relationship between two or more
variables. The amount of correlation in the data is measured by a
coefficient of correlation, which is denoted by r.

A scatter diagram is obtained by plotting values of one variable on
the X axis and the other variable on the Y axis. e.g. if we want to study the
relationship between heights and weights in a group of students, then the
height and weight of each student will be recorded. Say for
example the following readings are obtained.

Student    1      2      3      4      5
Height     165    175    160    180    160
Weight     52     57     54     60     50

Plot the points (52, 165), (57, 175), (54, 160), (60, 180), (50, 160)
on graph paper with a suitable scale. This graph is a scatter diagram or
scatter plot.

[Figures 3.2.1 to 3.2.6: scatter diagrams. Scale: X = weight of a student, Y = height of a student. Graph I: correlation coefficient = +1. Graph II: correlation coefficient = -1 (graph is descending).]

Let’s assume that in the graph the variable X on the horizontal
axis represents the weight of a student, and that the variable Y graphed
on the Y axis represents the height of the student. If the two variables
increase together and are perfectly correlated, all points in the scatter
diagram fall on a straight line, as you see in the upper left graph of the
above figure. In this case the correlation coefficient is +1.

If they are perfectly correlated and as one variable increases the
other decreases, the points lie on a falling straight line, as you see in the
upper right graph. In this case the correlation coefficient is -1.

## Correlation & Regression / 75

In the middle left graph you see a little bit of scatter about the
line. The correlation coefficient here is between 0 and 1.
The presence of a correlation means that, given an X value on the
horizontal axis, you can make a prediction of the Y value by using the straight
line predictors you see in the graphs. The better the correlation, the
more accurate the prediction. The correlation coefficient just measures
the degree to which the scatter diagrams for the variables approximate
straight lines.

Graph - coefficient of correlation

Graph   Variables                                  Correlation        Interpretation
I       X - Weight of a student                    +1                 Two variables increase together. Perfect correlation
        Y - Height of a student
II      X - Health of a student                    -1                 X increases, Y decreases. Perfect correlation
        Y - Height of a student
III     X - Health of a student                    Between 0 and 1    Not perfect correlation
        Y - Height / I.Q. of a student
IV      X - Economic status                        Between 0 and -1   Not perfect correlation
        Y - Height / I.Q. of a student
V       X - Health of a student                    Almost 0           No correlation
        Y - Height / I.Q. of a student
VI      X - Social status                          Almost 0           No correlation
        Y - Height / I.Q. of a student

In general, the correlation coefficient:
1) varies between +1 and -1
2) is 0 when there is no correlation

Covariance provides a measure of the strength of the correlation
between two or more sets of random variates. The covariance of two
random variables x and y, each with sample size n, is defined by
the expectation value

$$\mathrm{Cov}(x, y) = \frac{\sum (x - \bar{x})(y - \bar{y})}{n} = \frac{\sum xy}{n} - \bar{x}\,\bar{y}$$

For uncorrelated variables the covariance is zero. However, if the
variables are correlated in some way, then their covariance will be
non-zero. In fact, if cov (x, y) > 0, then y tends to increase as x increases,
and if cov (x, y) < 0, then y tends to decrease as x increases. Note that
while statistically independent variables are always uncorrelated, the
converse is not necessarily true.

cov (x, x) = σx², i.e. var (x)
var (x + y) = var (x) + var (y) + 2cov (x, y)
var (x - y) = var (x) + var (y) - 2cov (x, y)
cov (x + z, y) = cov (x, y) + cov (z, y)
cov (ax, by) = ab.cov (x, y)

The coefficient of correlation is a measure of correlation. It gives
the degree to which the two variables are interrelated.

The coefficient of correlation between two variables x, y is generally
denoted by r or rxy, and rxy = ryx.

(1) Univariate distribution: In this case there is only one variable,
such as the height of the students in a class.
(2) Bivariate distribution: In this case there are two variables, such
as the height and weight of the students in a class.
(3) Covariance: Let the corresponding values of the two variables x
and y on the given set of n pairs of observations be given by the
pairs (x1, y1), (x2, y2), …, (xn, yn). Then

$$\mathrm{Cov}(x, y) = \frac{(x_1 - \bar{X})(y_1 - \bar{Y}) + (x_2 - \bar{X})(y_2 - \bar{Y}) + \dots + (x_n - \bar{X})(y_n - \bar{Y})}{n} = \frac{1}{n} \sum (x_i - \bar{X})(y_i - \bar{Y})$$

where X̄ and Ȳ are the means of the x series and y series respectively.
The above formula is cumbersome for calculation. An easier method of
calculation is:

$$\mathrm{Cov}(x, y) = \frac{1}{n}\left[\sum x_i y_i - \frac{1}{n} \sum x_i \sum y_i\right]$$

Ex 1 Calculate the covariance of the following pairs of observations
of two variables x and y: (1, 6) (2, 9) (3, 6) (4, 7) (5, 8)

Σ xi = 1 + 2 + 3 + 4 + 5 = 15
Σ yi = 6 + 9 + 6 + 7 + 8 = 36
Σ xi yi = 6 + 18 + 18 + 28 + 40 = 110

$$\mathrm{Cov}(x, y) = \frac{1}{n}\left[\sum x_i y_i - \frac{1}{n} \sum x_i \sum y_i\right] = \frac{1}{5}\left[110 - \frac{1}{5}(15)(36)\right] = \frac{1}{5}\,[110 - 108] = \frac{2}{5}$$

= 0.4
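
The definition and the easier calculation formula give the same value, which can be checked with the following Python sketch. The function names are our own.

```python
def covariance(xs, ys):
    """Covariance from deviations about the two means (definition)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


def covariance_shortcut(xs, ys):
    """Covariance from the sums: (1/n) * [sum(xy) - (1/n) * sum(x) * sum(y)]."""
    n = len(xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    return (sum_xy - sum(xs) * sum(ys) / n) / n


xs = [1, 2, 3, 4, 5]
ys = [6, 9, 6, 7, 8]
print(round(covariance(xs, ys), 4))           # 0.4
print(round(covariance_shortcut(xs, ys), 4))  # 0.4
```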

The correlation coefficient r is only appropriate for measuring the
degree of relationship between variables that are linearly related, i.e. when
the points fall along and about an imaginary straight line that passes through
the clusters.

It is also assumed that the two variables have a bivariate normal
distribution for any given value.

## 3.4 Karl Pearson’s Correlation Coefficient

This method is used for measuring the linear relationship between two
variables (series). Pearson’s coefficient between two variables (x, y) is
denoted by r (x, y), rxy, ryx or simply r. It is also known as the product
moment correlation coefficient. It is the ratio of the covariance cov
(x, y) to the product of the standard deviations of x and y.

$$r = \frac{\mathrm{cov}(x, y)}{\sigma_x \sigma_y}, \qquad \sigma = \text{standard deviation}$$

Now for n pairs of observations (x1, y1), (x2, y2), …, (xn, yn)

$$\mathrm{cov}(x, y) = \frac{1}{n} \sum (x - \bar{X})(y - \bar{Y})$$

$$\sigma_x = \sqrt{\frac{1}{n} \sum (x - \bar{X})^2}$$

$$\sigma_y = \sqrt{\frac{1}{n} \sum (y - \bar{Y})^2}$$

$$r = \frac{\sum (x - \bar{X})(y - \bar{Y})}{\sqrt{\sum (x - \bar{X})^2 \sum (y - \bar{Y})^2}}$$

$$r = \frac{\sum d_x d_y}{\sqrt{\sum d_x^2 \sum d_y^2}}$$

where dx = (x - X̄) and dy = (y - Ȳ)

Alternative formula :

$$r = \frac{n \sum xy - (\sum x)(\sum y)}{\sqrt{\left[n \sum x^2 - (\sum x)^2\right]\left[n \sum y^2 - (\sum y)^2\right]}}$$

Ex 2 Calculate the coefficient of correlation for the following
data: (1, 2) (2, 4) (3, 8) (4, 7) (5, 10) (6, 5) (7, 14) (8, 16) (9, 2)
(10, 20)

Ans

Table for calculating correlation coefficient

X        Y      X²      Y²      XY
1        2      1       4       2
2        4      4       16      8
3        8      9       64      24
4        7      16      49      28
5        10     25      100     50
6        5      36      25      30
7        14     49      196     98
8        16     64      256     128
9        2      81      4       18
10       20     100     400     200
Total    55     88      385     1114     586

(The totals are Σx = 55, Σy = 88, Σx² = 385, Σy² = 1114, Σxy = 586.)

xy is calculated by multiplying the corresponding values of x and y,
e.g. 2 = 1 × 2
8 = 2 × 4

$$\bar{X} = \frac{\sum x_i}{n} = \frac{55}{10} = 5.5$$

$$\bar{Y} = \frac{\sum y_i}{n} = \frac{88}{10} = 8.8$$

$$r = \frac{n \sum xy - (\sum x)(\sum y)}{\sqrt{\left[n \sum x^2 - (\sum x)^2\right]\left[n \sum y^2 - (\sum y)^2\right]}}$$

$$= \frac{(10)(586) - (55)(88)}{\sqrt{\left[(10)(385) - (55)^2\right]\left[(10)(1114) - (88)^2\right]}}$$

$$= \frac{1020}{\sqrt{(825)(3396)}} = \frac{1020}{\sqrt{2801700}} = \frac{1020}{1673.8}$$

= 0.61 (approx)
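
The alternative formula needs only five sums, which makes it easy to sketch in Python. The function name `pearson_r` is our own.

```python
import math


def pearson_r(xs, ys):
    """Karl Pearson's r from the sums of x, y, x squared, y squared and xy."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    sum_yy = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_xx - sum_x ** 2) * (n * sum_yy - sum_y ** 2))
    return numerator / denominator


xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [2, 4, 8, 7, 10, 5, 14, 16, 2, 20]
print(round(pearson_r(xs, ys), 2))  # 0.61
```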

Ex 3 The following table gives the monthly income and savings
of 10 persons. Calculate the correlation between
monthly income and savings.

Employee          1     2     3     4     5     6     7     8     9     10
Monthly Income    780   360   980   250   750   820   900   620   650   390
Net saving        84    51    91    60    68    62    86    58    53    47

Solution:

$$\bar{X} = \frac{6500}{10} = 650$$

$$\bar{Y} = \frac{660}{10} = 66$$

No       X      Y     x = X - X̄    y = Y - Ȳ    x²        y²      xy
1        780    84    130          18           16900     324     2340
2        360    51    -290         -15          84100     225     4350
3        980    91    330          25           108900    625     8250
4        250    60    -400         -6           160000    36      2400
5        750    68    100          2            10000     4       200
6        820    62    170          -4           28900     16      -680
7        900    86    250          20           62500     400     5000
8        620    58    -30          -8           900       64      240
9        650    53    0            -13          0         169     0
10       390    47    -260         -19          67600     361     4940
Total    6500   660                             539800    2224    27040

$$r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \sum (Y - \bar{Y})^2}} = \frac{\sum xy}{\sqrt{\sum x^2 \sum y^2}}$$

$$r = \frac{27040}{\sqrt{539800 \times 2224}}$$

= 0.78
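
The deviation form of the formula, r = Σxy / √(Σx² Σy²) with x and y measured from their means, can be sketched in Python as below. The function name `pearson_r_deviations` is our own; it also returns the three sums so they can be compared with the table.

```python
import math


def pearson_r_deviations(xs, ys):
    """Pearson's r from deviations about the means, plus the three sums used."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    sum_dx2 = sum(a * a for a in dx)
    sum_dy2 = sum(b * b for b in dy)
    return sum_dxdy / math.sqrt(sum_dx2 * sum_dy2), sum_dxdy, sum_dx2, sum_dy2


income = [780, 360, 980, 250, 750, 820, 900, 620, 650, 390]
saving = [84, 51, 91, 60, 68, 62, 86, 58, 53, 47]
r, sum_dxdy, sum_dx2, sum_dy2 = pearson_r_deviations(income, saving)
print(sum_dxdy, sum_dx2, sum_dy2)  # 27040.0 539800.0 2224.0
print(round(r, 2))                 # 0.78
```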

The value of r indicates a high degree of association between the
variables X and Y.

The coefficient of rank correlation is denoted by R.

It is applied to problems where no quantitative data is available
but qualitative (ranked) data is.

$$R = 1 - \frac{6 \sum D^2}{n(n^2 - 1)}$$

D² = Square of the difference of corresponding ranks,
n = number of paired observations.

Ex 4 The rankings of 10 students in Statistics and Accountancy
are as follows:

Statistics     3   5   8   4   7   10   2   1    6   9
Accountancy    6   4   9   8   1   2    3   10   5   7

What’s the coefficient of rank correlation?

Rank X    Rank Y    D     D²
3         6         -3    9
5         4         1     1
8         9         -1    1
4         8         -4    16
7         1         6     36
10        2         8     64
2         3         -1    1
1         10        -9    81
6         5         1     1
9         7         2     4
Total                     214

$$R = 1 - \frac{6 \sum D^2}{n(n^2 - 1)} = 1 - \frac{6(214)}{10(10^2 - 1)}$$

= - 0.297
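
When the ranks are already given, the formula translates directly into Python. This is a minimal sketch; the function name `spearman_from_ranks` is our own.

```python
def spearman_from_ranks(ranks_x, ranks_y):
    """Spearman's R = 1 - 6 * sum(D^2) / (n * (n^2 - 1)) for given ranks."""
    n = len(ranks_x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_x, ranks_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


statistics_ranks = [3, 5, 8, 4, 7, 10, 2, 1, 6, 9]
accountancy_ranks = [6, 4, 9, 8, 1, 2, 3, 10, 5, 7]
print(round(spearman_from_ranks(statistics_ranks, accountancy_ranks), 3))  # -0.297
```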

## Rank Correlation Coefficient

When the Ranks are not given
1. Assign the rank highest first and the lowest last on both x and y
2. Find the Rank difference (D), then D2
3. Apply formula as done earlier.

## EX 5 Calculate the rank correlation coefficient from the following

data.

x: 75 88 95 70 60 80 81 50
y: 120 134 150 115 110 140 142 100

Assign the first rank to the highest value.
x in descending order: 95, 88, 81, 80, 75, 70, 60, 50
y in descending order: 150, 142, 140, 134, 120, 115, 110, 100

x    Rank of x   y     Rank of y   D    D²
75   5           120   5           0    0
88   2           134   4           -2   4
95   1           150   1           0    0
70   6           115   6           0    0
60   7           110   7           0    0
80   4           140   3           1    1
81   3           142   2           1    1
50   8           100   8           0    0
Total                                   6

With n = 8 and $\sum D^2 = 6$:

$$R = 1 - \frac{6 \times 6}{8(64 - 1)} = 1 - \frac{36}{504} = 1 - \frac{1}{14} = \frac{13}{14} = 0.93$$
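When only raw values are given, the ranking step can also be automated. The following Python sketch ranks the data of Ex 5 with the highest value first and then applies the rank correlation formula; the helper `ranks_desc` is an illustrative name and assumes there are no tied values.

```python
def ranks_desc(values):
    """Rank 1 goes to the highest value; assumes no tied values."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]

x = [75, 88, 95, 70, 60, 80, 81, 50]
y = [120, 134, 150, 115, 110, 140, 142, 100]

rank_x, rank_y = ranks_desc(x), ranks_desc(y)
n = len(x)
sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
R = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

print(rank_x)               # [5, 2, 1, 6, 7, 4, 3, 8]
print(rank_y)               # [5, 4, 1, 6, 7, 3, 2, 8]
print(sum_d2, round(R, 2))  # 6 0.93
```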

The rank correlation coefficient is easy to understand and calculate. It is useful for ordinal data.

Its shortcomings are that it cannot be used for grouped frequency distributions and it is not as accurate as Karl Pearson's coefficient of correlation. It is cumbersome to use for large data sets and it cannot be used for continuous variables.

3.6 Coefficient of Concurrent Deviation

The principle underlying the coefficient of concurrent deviation is that it looks only at the direction in which short term fluctuations take place. If the majority of short term fluctuations are in the same direction, then the variables have positive correlation, and if the majority of short term fluctuations are in opposite directions, then the variables have negative correlation.

$$r_c = \pm \sqrt{\pm \frac{2c - m}{m}}$$

c = number of pairs of concurrent deviations
m = number of pairs of deviations, which is one less than the actual number of observations N, i.e. m = N - 1

The sign of $(2c - m)/m$ is used both inside and outside the square root, so the quantity under the root is never negative.

## Ex 6 Calculate the coefficient of correlation between price and

demand by concurrent deviation method.

Price : 1 4 3 5 5 8 10 10 11 15
Demand : 100 80 80 60 58 50 40 40 35 30

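Solution :

Price (X)   C_X   Demand (Y)   C_Y   C_X C_Y
1                 100
4           +     80           -     -
3           -     80           0     0
5           +     60           -     -
5           0     58           -     0
8           +     50           -     -
10          +     40           -     -
10          0     40           0     +
11          +     35           -     -
15          +     30           -     -
                                     c = 1

Only one pair of deviations (both unchanged, 10 to 10 and 40 to 40) moves in the same direction, so c = 1.

Here
m = N - 1 = 10 - 1 = 9

$$r_c = \pm \sqrt{\pm \frac{2c - m}{m}} = -\sqrt{-\frac{2(1) - 9}{9}} = -\sqrt{\frac{7}{9}} = -0.88$$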
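The concurrent deviation method can be sketched in a few lines of Python. The function below is illustrative (its name and the helper `sign` are assumptions, not part of the text); it counts a pair as concurrent when both deviations have the same sign, treating "no change" in both series as concurrent, which is the convention used in Ex 6. For the price and demand data it gives $-\sqrt{7/9}$, about -0.88.

```python
import math

def sign(v):
    """Return +1, -1 or 0 according to the sign of v."""
    return (v > 0) - (v < 0)

def concurrent_deviation(xs, ys):
    dx = [sign(b - a) for a, b in zip(xs, xs[1:])]  # direction of change in x
    dy = [sign(b - a) for a, b in zip(ys, ys[1:])]  # direction of change in y
    m = len(dx)                                     # m = N - 1
    c = sum(1 for a, b in zip(dx, dy) if a == b)    # concurrent pairs
    ratio = (2 * c - m) / m
    outer = 1 if ratio >= 0 else -1                 # same sign inside and outside the root
    return outer * math.sqrt(abs(ratio))

price = [1, 4, 3, 5, 5, 8, 10, 10, 11, 15]
demand = [100, 80, 80, 60, 58, 50, 40, 40, 35, 30]

rc = concurrent_deviation(price, demand)
print(round(rc, 2))  # -0.88
```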

The advantages of the coefficient of concurrent deviation are that it is simple to understand and compute, and it is extremely useful for short term fluctuation analysis.

The disadvantages are that it is not useful for the long term. It does not differentiate between small and big variations. The results are a rough indicator and not as accurate as those of other methods.

1) Correlation is an analysis of the _______________.
2) If two variables increase together, there is _______________ correlation.
3) In a univariate distribution there is only _______________ variable in the data to study.
4) In a _______________ there are two variables under study.
5) SE means _______________.
6) The square of the correlation coefficient is called _______________.
7) SLR Analysis indicates that there is only one _______________.
8) If there are two or more independent variables, then it is _______________ Analysis.
9) byx is the regression coefficient of _______________.
10) _______________ means variance around the regression line is the same for all values of variable x.

3.7 Standard Error and Probable Error

Standard error (SE) of the coefficient of correlation is given by

$$SE = \frac{1 - r^2}{\sqrt{n}}$$

r = coefficient of correlation
n = number of observations in pairs.

Probable error of the coefficient of correlation (P.E.) is given by

$$PE = 0.6745 \times SE \approx \tfrac{2}{3} \times SE$$

This helps in interpreting the value of the correlation coefficient.

Properties of P.E.
1) If r < 6 (PE), then r is not significant.
2) If r > 6 (PE), then r is significant and correlation exists.
Thus PE is used for testing the reliability of the value of r.

The square of the correlation coefficient ($r^2$) is called the coefficient of determination.

The coefficient of determination is the ratio of explained variation to the total variation. It tells us what proportion of the variation in the dependent variable can be attributed to the variation in the independent variable.

Ex 7 If r = 0.8, what is the percentage of explained variation?

r = 0.8, $r^2$ = 0.64.

Therefore, 64% of the variation can be explained. (100 – 64 = 36% of the variation is unexplained.)

error of r

$$P.E. = 0.6745 \frac{1 - r^2}{\sqrt{n}}$$

Here P.E. = 0.072 and n = 25, so

$$0.072 = \frac{0.6745 (1 - r^2)}{\sqrt{25}} = \frac{0.6745 (1 - r^2)}{5}$$

$$1 - r^2 = \frac{0.072 \times 5}{0.6745} = \frac{0.360}{0.6745} = 0.5337$$

$$r^2 = 1 - 0.5337 = 0.4663$$

$$r = \sqrt{0.4663} = 0.683$$

$$\text{Standard Error } SE = \frac{1 - r^2}{\sqrt{n}} = \frac{0.5337}{5} = 0.1067$$
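The same arithmetic can be verified numerically. The Python sketch below implements the SE and P.E. formulas and inverts the P.E. formula for r, taking the non-negative root; the function names are illustrative assumptions.

```python
import math

def standard_error(r, n):
    """SE = (1 - r^2) / sqrt(n)."""
    return (1 - r ** 2) / math.sqrt(n)

def probable_error(r, n):
    """PE = 0.6745 * SE."""
    return 0.6745 * standard_error(r, n)

def r_from_probable_error(pe, n):
    """Invert PE = 0.6745 (1 - r^2) / sqrt(n); returns the non-negative root."""
    one_minus_r2 = pe * math.sqrt(n) / 0.6745
    return math.sqrt(1 - one_minus_r2)

r = r_from_probable_error(0.072, 25)
se = standard_error(r, 25)
print(round(r, 3))   # 0.683
print(round(se, 4))  # 0.1067
```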

3.9 Regression

Regression Analysis is the procedure used to establish the relationship between the dependent or response variable (Y) and the independent or explanatory variable (X), and it is used for estimating the value of one variable given the value of the other variable.
Simple Linear Regression Analysis usually begins by plotting the set of (X, Y) values on a scatter diagram and determining by inspection whether there exists an approximate linear relationship:

$$Y = a + b_{yx} X$$

Since the points are unlikely to fall precisely on the line, the exact linear relationship must be modified to include an error (stochastic or random disturbance) term:

$$Y = a + b_{yx} X + e$$

Correlation Analysis is concerned with measuring the degree of relationship between variables, rather than estimating the value of a dependent variable.

Linear Regression Analysis indicates that the algebraic model defines a straight line.

Simple Linear Regression (SLR) Analysis indicates that there is only one independent variable, while Multiple Regression Analysis indicates that two or more independent / explanatory variables are used in estimating the value of the dependent variable.

Some assumptions associated with simple linear regression analysis:

1. Linearity

The independent and dependent variables have a linear relationship, represented by the equations:

$$Y = a + b_{yx} X + e$$

$$\hat{Y} = a + b_{yx} X$$

where Y is the value of the dependent variable in the observation, $\hat{Y}$ is the estimated value of the dependent variable from the linear relationship, a is the first parameter of the regression equation, indicating the value of Y when X = 0, $b_{yx}$ is the second parameter of the regression equation, indicating the slope of the regression line, and e is the random error in the same observation, associated with the sampling process.
A Scatter Diagram is a graph that shows the relationship between the two variables and can be used to observe whether there is general agreement with the assumptions underlying regression analysis.


An alternative graph to determine such agreement is Residual
Graph which is a graph of the residuals e = Y – Ŷ with respect to the
values of Ŷ .
Direct (Positive) Relationship indicates that the values of the
dependent variable Y generally increases as the values of the indepen-
dent variable X increase.
Inverse (Negative) Relationship indicates that the values of Y gen-
erally decrease as the values of X increase.
The general degree of relationship between the variables is indi-
cated by the extent of scatter with respect to the best fitting line.
The mathematical criterion generally used to determine the lin-
ear regression equation is the Least Squares Criterion by which the sum
of the squared deviations between the actual and estimated values of the
dependent variable is minimized.
The parameters A and Byx (Population) in the linear regression
model are estimated by the values of a and byx based on the sample.
Thus, the linear regression equation to determine the estimated (computed) value of the dependent variable, given a value of X for the independent variable, is:

$$\hat{Y} = a + b_{yx} X$$

We can also have a regression equation to calculate the value of X given the value of Y. It takes the form

$$\hat{X} = a + b_{xy} Y$$

3.9.1 Least Squares method

The least squares method is a technique for fitting the 'best' straight line to the sample of (X, Y) observations. When we want to estimate Y from X, the sum of the squared vertical (Y) deviations of the points from the line is to be minimized.

$$\text{Minimize } \sum (Y - \hat{Y})^2$$

$\hat{Y}$ = estimated value, i.e. the value calculated from the equation of the line, which is derived for the data so that the sum of squares of the errors is minimum.

When we want to estimate X from Y, the sum of the squared horizontal (X) deviations of the points from the line is to be minimized.

$$\text{Minimize } \sum (X - \hat{X})^2$$

$\hat{X}$ = estimated value, i.e. the value calculated from the equation of the line, which is derived for the data so that the sum of squares of the errors is minimum. Thus, when we want to estimate the value of y from a given value of x, we need an equation which is different from the equation needed to estimate the value of x from a given value of y. However, if the correlation coefficient is ±1, then one and the same equation is used to estimate the values of both variables.

The regression equation of y on x is used to estimate the value of y from a given value of x.

It is expressed as $y = a + b_{yx} x$

where a and b are constants representing the y intercept and the slope of the line respectively. The values of a and b are obtained by solving the following equations, which are called the normal equations for y on x. These equations are obtained from the n given pairs of observations for y and x.

$$\sum y = na + b \sum x$$

$$\sum xy = a \sum x + b \sum x^2$$

We can also use the form

$$y - \bar{y} = b_{yx} (x - \bar{x})$$

where $b_{yx}$ is the regression coefficient of y on x, which is given by

$$b_{yx} = \frac{Cov(x, y)}{\sigma_x^2} = r \frac{\sigma_y}{\sigma_x} = \frac{\sum dx \, dy}{\sum dx^2}$$

where

$$dx = X - \bar{X} \quad \text{and} \quad dy = Y - \bar{Y}$$

or alternatively,

$$b_{yx} = \frac{n \sum xy - (\sum x)(\sum y)}{n \sum x^2 - (\sum x)^2}$$

or

$$b_{yx} = \frac{\sum xy - n \bar{X} \bar{Y}}{\sum x^2 - n \bar{X}^2}$$
3.9.2 Properties of regression Co-efficient

the correlation coefficient, where $r_{yx} = r_{xy} = r$.

2) Both regression coefficients $b_{xy}$ and $b_{yx}$ have the same sign.

3) $r^2 = b_{xy} \times b_{yx}$, where r = correlation coefficient between x and y, and r has the same sign (+ or -) as $b_{xy}$ and $b_{yx}$.

Regression coefficients are independent of change of origin but are dependent on change of scale.

Ex 9 Estimate the performance rating of industrial trainees on the basis of selection test scores. The selection test score (X) and performance rating (Y) of the 20 sampled individuals are given in the first three columns of the table below.

Solution :
We can use the following table for determining the linear regression equation for the data X and Y. We have:

Sampled      Selection      Perform     XY      X²       Y²
individual   Test Score X   Rating Y
1            88             17          1496    7744     289
2            85             16          1360    7225     256
3            72             13          936     5184     169
4            93             18          1674    8649     324
5            70             11          770     4900     121
6            74             14          1036    5476     196
7            78             15          1170    6084     225
8            93             19          1767    8649     361
9            82             16          1312    6724     256
10           92             20          1840    8464     400
11           79             14          1106    6241     196
12           84             15          1260    7056     225
13           71             12          852     5041     144
14           77             13          1001    5929     169
15           87             19          1653    7569     361
16           87             17          1479    7569     289
17           72             10          720     5184     100
18           77             12          924     5929     144
19           82             14          1148    6724     196
20           76             13          988     5776     169
Total        1619           298         24492   132117   4590

$$b_{yx} = \frac{\sum xy - n \bar{X} \bar{Y}}{\sum x^2 - n \bar{X}^2}$$

where $\bar{X} = \sum x / n = 1619 / 20 = 80.95$ and $\bar{Y} = \sum y / n = 298 / 20 = 14.90$.

$$b_{yx} = \frac{\sum xy - n \bar{X} \bar{Y}}{\sum x^2 - n \bar{X}^2} = \frac{24{,}492 - 20(80.95)(14.90)}{132{,}117 - 20(80.95)^2}$$

$$= \frac{24{,}492 - 24{,}123.1}{132{,}117 - 131{,}058.05} = \frac{368.9}{1058.95} = 0.3484 \approx 0.35$$

$$a = \bar{Y} - b_{yx} \bar{X} = 14.90 - 0.35(80.95) = 14.90 - 28.3325 = -13.43$$
Therefore the regression equation for estimating the performance rating on the basis of the selection test score is:

$$\hat{Y} = a + b_{yx} X$$

$$\hat{Y} = -13.43 + 0.35X$$

The value $b_{yx}$ = 0.35 indicates that the slope of the regression line is 0.35, meaning that for each increase of one point in the selection test score there is, on average, an increase of 0.35 in the performance rating. Therefore, a direct (positive) relationship exists between these two variables.
The value a = -13.43 may look a bit puzzling. Graphically, this is the point of intersection of the regression line with the Y axis; hence this is the value of Y when X = 0. But how can there be a 'negative' performance rating when the data indicate that only positive ratings are assessed? The answer is that any regression equation is only meaningful for the range of values of the independent variable included in the sample.
Now, if a trainee applicant has a selection test score of 90, the estimated performance rating on the job is:

$$\hat{Y} = a + b_{yx} X = -13.43 + 0.35(90) = -13.43 + 31.50 = 18.07 \approx 18$$
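The fit can be reproduced from the raw scores with a few lines of Python. This is a sketch only, and `fit_line` is an illustrative name. Note one difference from the hand calculation: the code keeps the unrounded slope (about 0.3484), so its intercept comes out near -13.30 rather than the -13.43 obtained after rounding the slope to 0.35, and the estimate for a score of 90 is about 18.05, which still rounds to 18.

```python
scores = [88, 85, 72, 93, 70, 74, 78, 93, 82, 92,
          79, 84, 71, 77, 87, 87, 72, 77, 82, 76]
ratings = [17, 16, 13, 18, 11, 14, 15, 19, 16, 20,
           14, 15, 12, 13, 19, 17, 10, 12, 14, 13]

def fit_line(xs, ys):
    """Least squares line of y on x: returns (intercept a, slope b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b = (sum_xy - n * mean_x * mean_y) / (sum_x2 - n * mean_x ** 2)
    a = mean_y - b * mean_x
    return a, b

a, b = fit_line(scores, ratings)
estimate_90 = a + b * 90
print(round(b, 4), round(a, 2), round(estimate_90, 2))
```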


3.10 Residual Values

A residual (e) is the difference between the observed value (Y) and the estimated value ($\hat{Y}$). It is also called the error and is given by:

$$e = Y - \hat{Y}$$

A residual graph is a plot of the residuals e against the fitted regression line values $\hat{Y}$.

The closer the observations fall to the regression line (that is, the smaller the residuals or error terms), the greater is the variation in Y (the dependent variable) 'explained' by the estimated regression equation. The total variation in Y is equal to the explained variation plus the residual variation.

Sampled individual   Selection Test Score X   Perform Rating Y   Fitted value Ŷ   Residual e = Y – Ŷ
1 88 17 17.37 -0.37
2 85 16 16.32 -0.32
3 72 13 11.77 1.23
4 93 18 19.12 - 1.12
5 70 11 11.07 -0.07
6 74 14 12.47 1.53
7 78 15 13.87 1.13
8 93 19 19.12 -0.12
9 82 16 15.27 0.73
10 92 20 18.77 1.23
11 79 14 14.22 -0.22
12 84 15 15.97 -0.97
13 71 12 11.42 0.58
14 77 13 13.52 -0.52
15 87 19 17.02 1.98

16 87 17 17.02 -0.02
17 72 10 11.77 -1.77
18 77 12 13.52 -1.52
19 82 14 15.27 -1.27
20 76 13 13.17 -0.17

Total Sum of Squares (TSS) = Total variation in y = $\sum (y - \bar{Y})^2$

Regression Sum of Squares (RSS) = Explained variation = $\sum (\hat{Y} - \bar{Y})^2$

Error Sum of Squares (ESS) = Residual variation in y = $\sum (y - \hat{Y})^2$

TSS = RSS + ESS

Dividing both sides by TSS gives us

1 = (RSS/TSS) + (ESS/TSS)

The Coefficient of Determination ($r^2$) is then defined as the proportion of the total variation in Y (dependent variable) explained by the regression of Y on X (independent variable):

$r^2$ = RSS/TSS = 1 – (ESS/TSS)
i.e. (ESS/TSS) = 1 – $r^2$

Thus the quantity (1 – $r^2$) is called the coefficient of non-determination, as it tells the proportion of the variation in one variable which cannot be attributed to the variation in the other variable.
The Coefficient of Correlation (r), on the other hand, is the square root of the coefficient of determination, with the arithmetic sign being positive if the relationship is direct and negative if the relationship is inverse.
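The decomposition TSS = RSS + ESS can be checked numerically for any least squares line. The sketch below does so in Python for the monthly income and net saving data used earlier in this chapter for Karl Pearson's coefficient; the variable names are illustrative. The ratio RSS/TSS comes out at about 0.609, which is the square of r = 0.78 for the same data.

```python
income = [780, 360, 980, 250, 750, 820, 900, 620, 650, 390]
saving = [84, 51, 91, 60, 68, 62, 86, 58, 53, 47]

n = len(income)
mean_x = sum(income) / n
mean_y = sum(saving) / n

# Least squares line of saving (y) on income (x)
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(income, saving))
s_xx = sum((x - mean_x) ** 2 for x in income)
b = s_xy / s_xx
a = mean_y - b * mean_x
fitted = [a + b * x for x in income]

tss = sum((y - mean_y) ** 2 for y in saving)               # total variation
rss = sum((f - mean_y) ** 2 for f in fitted)               # explained variation
ess = sum((y - f) ** 2 for y, f in zip(saving, fitted))    # residual variation
r_squared = rss / tss

print(round(tss, 2), round(rss + ess, 2))  # both 2224.0
print(round(r_squared, 3))                 # 0.609
```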

We have seen that correlation studies the relationship between two variables X and Y, while regression is used to estimate the value of one variable when the value of the other is given.

Regression analysis is used for estimating or predicting the unknown values of a variable (called the dependent variable) from the known values of another (called the independent variable). The regression line describes the average relationship between the variables x and y.


3.11 Standard Error Estimate

The standard error of estimate measures the deviation (dispersion) of the observed values about the regression line. It is given by

$S_{yx}$ = (standard error of estimate of y for given x)

$$S_{yx} = \sqrt{\frac{\text{unexplained variation}}{n}} = \sqrt{\frac{\sum (Y - \hat{Y})^2}{n}}$$

$$S_{yx} = \sigma_y \sqrt{1 - r^2}$$

The standard error of estimate of x for given y is $S_{xy}$:

$$S_{xy} = \sqrt{\frac{\sum (X - \hat{X})^2}{n}}$$

where X is the observed value.

The standard error of estimate measures the accuracy of the estimated figures. The smaller its value, the better are the estimates and hence the more representative is the regression line.

Ex 11 The following data give the experience of machine operators and their performance ratings, measured by the number of good parts turned out per 100 pieces.

Operator         1    2    3    4    5    6    7    8
Experience (x)   16   12   18   4    3    10   5    12
Ratings (y)      87   88   89   68   78   80   75   83

Calculate the regression line of performance ratings on experience and estimate the probable performance if an operator has 7 years of experience.

Solution :

n = 8

We have

$$\bar{X} = \frac{\sum x}{n} = \frac{80}{8} = 10, \qquad \bar{Y} = \frac{\sum y}{n} = \frac{648}{8} = 81$$

x     y     dx = x - 10   dy = y - 81   dx²   dy²   dx·dy   x²     y²      xy
16    87    6             6             36    36    36      256    7569    1392
12    88    2             7             4     49    14      144    7744    1056
18    89    8             8             64    64    64      324    7921    1602
4     68    -6            -13           36    169   78      16     4624    272
3     78    -7            -3            49    9     21      9      6084    234
10    80    0             -1            0     1     0       100    6400    800
5     75    -5            -6            25    36    30      25     5625    375
12    83    2             2             4     4     4       144    6889    996
80    648   0             0             218   368   247     1018   52856   6727

$$b_{yx} = \frac{\sum dx \, dy}{\sum dx^2} = \frac{247}{218} = 1.133$$

By the direct method:

$$b_{yx} = \frac{n \sum xy - (\sum x)(\sum y)}{n \sum x^2 - (\sum x)^2} = \frac{8 \times 6727 - 80 \times 648}{8 \times 1018 - (80)^2} = \frac{1976}{1744} = 1.133$$

The equation of the regression line of y on x is

$$y - \bar{Y} = b_{yx} (x - \bar{X})$$

y – 81 = 1.133 (x – 10)
y – 81 = 1.133x – 11.33
y = 1.133x + 81 – 11.33
y = 1.133x + 69.67 Ans.

If the experience is 7 years, the probable performance will be:
x = 7
y = 1.133x + 69.67
= (1.133) 7 + 69.67
= 7.931 + 69.67
= 77.60

Ex 12 Given the following data:

                     X    Y
Mean                 36   85
Standard deviation   11   8

The correlation coefficient between x and y is 0.66. Find the regression equation of x on y and hence estimate the value of x when y = 75.

Solution :
Given $\bar{X} = 36$, $\bar{Y} = 85$, $\sigma_x = 11$, $\sigma_y = 8$, r = 0.66

$$b_{xy} = r \frac{\sigma_x}{\sigma_y} = 0.66 \times \frac{11}{8} = 0.908$$

The regression equation of x on y is

$$x - \bar{X} = b_{xy} (y - \bar{Y})$$

x – 36 = 0.908 (y – 85)
x – 36 = 0.908 y – 77.180
x = 0.908 y – 77.180 + 36
= 0.908 y – 41.180

When y = 75, x will be:
x = 0.908 × 75 – 41.180
= 68.10 – 41.18
= 26.92 Ans.
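A regression line can be built from summary statistics alone, as in this example. The Python sketch below (the function name `regression_x_on_y` is an illustrative assumption) computes the slope $b_{xy} = r \sigma_x / \sigma_y$ and the intercept, then evaluates the line at y = 75. Since y = 75 is 10 below its mean, the estimate lies about 9.08 below the mean of x, i.e. close to 26.92.

```python
def regression_x_on_y(mean_x, mean_y, sd_x, sd_y, r):
    """Return (intercept, slope) of the regression line of x on y."""
    b_xy = r * sd_x / sd_y
    intercept = mean_x - b_xy * mean_y
    return intercept, b_xy

intercept, b_xy = regression_x_on_y(36, 85, 11, 8, 0.66)
x_hat = intercept + b_xy * 75

print(round(b_xy, 3))   # 0.907 (0.9075 before rounding; the text rounds it to 0.908)
print(round(x_hat, 1))  # 26.9
```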
3.12 Limitations of correlation and regression analysis

Correlation is a symmetrical statistical tool. It does not tell us which of the two variables is dependent and which is independent, so treating a correlation as a cause-and-effect relationship can give false results.

To the extent that there is a non-linear relationship between the two variables being correlated, the correlation coefficient will not capture this kind of relationship.

3.13 Homoscedasticity

This means that the variance around the regression line is the same for all values of the predictor variable x. In a plot that violates this assumption, the points for the lower values of x all lie very near the regression line, while for higher values on the x-axis there is much more variability around the regression line.

3.14 Summary

In this lesson we learnt about new statistical tools such as correlation, the line of regression, rank correlation, types of errors etc. These are all very important for estimating a dependent variable from an independent variable.

1) Co-Variance
2) Perfect
3) One
4) Bivariate distribution
5) Standard error
6) Coefficient of determination
7) Independent variable
8) Multiple Regression
9) y on x
10) Homoscedasticity

3.16 Questions for Self - Study

Short Notes:
1) Coefficient of concurrent deviation
2) Types of Errors
3) Line of Regression
4) Homoscedasticity
5) Standard Error Estimate

CHAPTER 4

## PROBABILITY AND DISTRIBUTIONS

4.0 Objectives
4.1 Introduction
4.2 Important Definitions
4.3 Basic Calculations in Probability
4.4 Basics of Permutations and Combinations
4.5 Set Theory & Probability Theorems
4.6 Bayes' Theorem
4.7 Mathematical Expectations or Expected Values
4.8 Binomial Distribution
4.9 Poisson Distribution
4.9.1 Properties of Poisson Distribution
4.9.2 Examples of Events
4.10 Normal Distribution
4.10.1 Properties of Normal Distribution
4.12 Summary
4.14 Questions for Self - Study

4.0 Objectives

## After studying the concepts of probability and Distributions,

calculations and all the probable formula students can explain the
following –
 Concept of probability
 Events
 Expectations
 Expected values
 Combinations
 Permutations
Students can understand the problems and solve with given nu-
Probability & Distributions / 105
4.1 Introduction

## Probability is applied in business to predict possible situations

out of all probable situations for making decision. After reading this chap-
ter, students can understand the mathematical theory of probability for
making business decision. Also we consider some situations where di-
rect theoretical results can be applied.

## Decision making models can be broadly classified as determin-

istic models and probabilistic models. In deterministic model, we do not
consider any uncertainty, while the real life situations are full of uncer-
tainties. And if these uncertainties are not included in decision making
process, one may end up in making incorrect decision incurring losses.

## Essence of probabilistic model is theory of probability. Probabil-

ity is a measure of uncertainty or certainty whichever way we define. For
example, if we say that probability that product will be sold is 0.6 then we
are 60% sure (certainty) that product will be sold and we are 40% unsure
(uncertainty) about the sale of the product.

## 4.2 Important Definitions

 Probability
It is theory of chance when taken as science.
It is chance of happening an event when considered in connec-
tion with the event. Probability of any event is between 0 and 1, both
included. Probability is also defined as the percentage of times for which
a specific out come would happen if the same experiment were repeated
number of times.

##  Mathematical or Objective probability

Probability theory, which is based on statistical data and prob-
ability axioms, is called as mathematical probability.

 Axioms of probability
There are three axioms of probability : (1) Chances are always
at least zero (2) The maximum chance that something happens is
100% (3) If two events cannot both occur at the same time, the chance
that either one occurs is the sum of the chances that each occurs.

 Subjective Probability
Probability theory, which is based on feeling or thinking of a per-
son, is called as subjective probability.

 Conditional Probability
It is the probability of an event calculated on the assumption that some related event has happened.

 Experiment
Action whose outcomes are of interest to us is called as an
experiment. e.g. toss of a coin. Chance of getting head or tail at a time is
exactly one half.

##  Event and Happening of an event.

Event is a set of one or more outcomes of an experiment. An
event is said to have happened if the outcome is the result of the experi-
ment. e.g. In the experiment of tossing of a coin there are two outcomes
head and tail. Two events A and B can be defined as follows:
Event A: Head shows up in the experiment of tossing of a coin.
Event B: Tail shows up in the experiment of tossing of a coin.
Now if head shows up, then we can say that event A has happened.
Probability of an event A is denoted by P(A).

 Sample Space
Set of all possible out comes of an experiment is called as sample
space.

 Dependent Events.
If happening of one event changes the probability of another event
then those events are said to be dependent events.

 Independent Events
If happening of one event does not change the probability of an-
other event then those events are said to be independent events.

##  Mutually Exclusive Events

Two or more events which cannot happen at the same point of
time are called as mutually exclusive events.

 Exhaustive Events
If two or more events cover the entire sample space i.e if two or
more events cover all possible outcomes of an experiment, then such
events are called as exhaustive events.

 Impossible Event
If probability of a happening of an event is 0, the event is an
impossible event.


 Certain Event
If probability of a happening of an event is 1, the event is a certain
event.

 Complement of an event
The complement of an event means that the event does not happen, i.e. if event A is getting 1 in a throw of a die, then A complement is not getting 1 in a throw of a die. The complement of event A is denoted by $A^c$, $A'$ or $\bar{A}$, and $P(A^c) = 1 - P(A)$.

##  Equally Likely Events.

Two or more events are called as equally likely if they have the
same probability of occurrence.

##  Mathematical Expectation, Expected Value.

For a discrete distribution, the expected value is the weighted mean of all outcomes, where the weights are the probabilities of the outcomes. If X and Y are two random variables, the expected value of their sum is the sum of their expected values, $E(X + Y) = E(X) + E(Y)$, and the expected value of a constant a times a random variable X is the constant times the expected value of X, $E(a \times X) = a \times E(X)$.
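As a small illustration of these two properties, the sketch below computes expected values for the throw of a fair die using exact fractions. The die example and the variable names are assumptions chosen for illustration; they are not taken from the text.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)   # outcomes of one fair die
p = Fraction(1, 6)    # probability of each face

# Expected value as a probability-weighted mean of the outcomes
e_x = sum(x * p for x in faces)

# E(X + Y) for two independent dice, computed over all 36 pairs
e_sum = sum((x + y) * p * p for x, y in product(faces, faces))

# E(3X): a constant times the random variable
e_3x = sum(3 * x * p for x in faces)

print(e_x, e_sum, e_3x)  # 7/2 7 21/2
```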

## 1) Probability of any event is between _______________.

2) Action whose outcomes are of interest to us is called as
_______________.

## 3) In experiment of tossing of coin there are two outcomes

_______________.
4) Set of all possible outcomes of an experiment is called as
_______________.
5) If happening of one event changes the probability of another
event then those events are said to _______________.

4.3 Basic Calculations in probability

## Probability is denoted by symbol ‘P’. This is expressed in a frac-

tion or decimal or in percentage. e.g. probability of getting head in a toss
of coin is ½ or 0.5 or 50%.

$$P(A) = \frac{\text{Number of favourable outcomes}}{\text{Total number of outcomes}}$$

## Ex 1: If a card is drawn from a pack of well shuffled cards find the

probability that it is a queen.

## Total number of outcomes = 52 (Total 52 cards)

Number of favourable outcomes = 4 (4 queens)
Thus P(A) = 4/52 = 1/13

If $\bar{A}$ (A bar) denotes the non-happening of A, then $P(\bar{A})$ is the probability of not getting a queen:

$$P(\bar{A}) = 1 - P(A) = 1 - \frac{1}{13} = \frac{12}{13}$$

Thus,

$$P(A) + P(\bar{A}) = \frac{1}{13} + \frac{12}{13} = 1$$

If event A is certain to occur, then P(A) = 1 and $P(\bar{A}) = 0$.
Alternatively, probability can be defined as

$$P(A) = \frac{n(A)}{n(S)}$$
Let us consider tossing of a coin. The outcomes are head or tail.
S denotes a complete set of outcomes for a given situation and it is
called as sample space or universe. Thus in above experiment, Sample
space = S = {H,T}
Let us define event A : Getting a head on the top surface.
Therefore A = {H}
Now n(A) denotes the number of elements in the set A. Since set A has only one element, n(A) = 1. Set S, the sample space, has 2 elements in it. Therefore, n(S) = 2. Thus the probability that a head is obtained in the tossing of a coin is

$$P(A) = \frac{n(A)}{n(S)} = \frac{1}{2}$$

If we apply the first definition, then the favourable outcomes are the ones in which we are interested. In this case we are interested only in a head, i.e. the number of favourable outcomes is only 1. The total number of outcomes is 2. Again the probability of getting a head is ½.

Ex 2 : Let us consider the toss of two coins at a time, or the toss of one coin twice, one after another. The possibilities are

First coin   Second coin
Head         Head
Head         Tail
Tail         Head
Tail         Tail

Therefore S = {(H,H), (H,T), (T,H), (T,T)} and n(S) = 4.

Event A: Getting two heads in the toss of two coins
Therefore A = {(H,H)}
i.e. n(A) = 1, n(S) = 4
Therefore P(A) = n(A) / n(S) = ¼

Event B: Getting at least one head in the toss of two coins
Therefore B = {(H,T), (T,H), (H,H)} (at least one head, therefore one head or two heads)
i.e. n(B) = 3, n(S) = 4
Therefore P(B) = n(B) / n(S) = ¾

## Ex 3: Two dice are thrown simultaneously and the points on the

dice are multiplied together. Find probability that the prod-
uct is 4.

Solution:

Event A: Product of scores obtained is 4

$$n(S) = (\text{Number of outcomes in one trial})^{\text{Number of trials}} = 6^2 = 36$$

Favourable outcomes are
A = {(1, 4), (2, 2), (4, 1)}
i.e. n(A) = 3

$$P(A) = \frac{n(A)}{n(S)} = \frac{3}{36} = \frac{1}{12}$$
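For a small sample space like this one, the favourable outcomes can simply be enumerated. A minimal Python sketch (the variable names are illustrative):

```python
from fractions import Fraction
from itertools import product

# All 36 ordered outcomes of throwing two dice
sample_space = list(product(range(1, 7), repeat=2))

# Event A: the product of the two scores is 4
event_a = [(i, j) for i, j in sample_space if i * j == 4]

p_a = Fraction(len(event_a), len(sample_space))
print(event_a)  # [(1, 4), (2, 2), (4, 1)]
print(p_a)      # 1/12
```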

## Ex 4 : What is the probability of getting three white balls in a draw

of 3 balls from a box containing 5 white and 4 black balls?

A short break for the basics of permutations and combinations. Suppose there are three balls: Red, Blue and Yellow. If 2 balls are selected out of these 3 balls, then the possible selections are Red, Blue or Blue, Yellow or Yellow, Red, i.e. there are three ways in which the selection can be made.
This is written as $^3C_2$, which means select 2 objects (here balls) at a time out of 3 objects, i.e. combinations of 2 objects taken at a time out of 3 objects. It is calculated as

$$^3C_2 = \frac{3 \times 2}{1 \times 2} = 3$$

(In the denominator, go on multiplying up to the number after C; in the numerator, go on multiplying in decreasing order starting from the number before C, for the same number of factors as in the denominator.)

$$^{10}C_3 = \frac{10 \times 9 \times 8}{1 \times 2 \times 3} = 120$$

$$^{10}C_3 = \frac{10!}{(10 - 3)! \, 3!} = \frac{10!}{7! \, 3!} = \frac{7! \times 8 \times 9 \times 10}{7! \times 1 \times 2 \times 3} = 120$$

The technical definition of a combination is

$$^nC_r = \frac{n!}{r! \, (n - r)!}$$

It is read as combinations of r objects taken at a time out of n objects.

n! = 1 × 2 × 3 × … × n
= n(n–1)(n–2)…1
e.g. 4! = 1 × 2 × 3 × 4
= 4 × 3 × 2 × 1 = 24
$^{10}C_3$ = 10! / 3! (10 – 3)! = 10! / 3! 7!
= (10 × 9 × 8) / (1 × 2 × 3) = 120
0! = 1
1! = 1
n! = n(n–1)!
= n(n–1)(n–2)! and so on,
i.e. 10! = 10 × 9 × 8! = 10 × 9 × 8 × 7! and so on.
$^nC_0$ = 1
$^nC_n$ = 1
$^nC_1$ = n
$^nC_r$ = $^nC_{n-r}$, i.e. $^{10}C_7$ = $^{10}C_{10-7}$ = $^{10}C_3$

Now what if the arrangements of the selected balls are also important? Arrangement means that the order in which the objects are presented also matters.

The following arrangements are possible:
Red, Blue or Blue, Red
Blue, Yellow or Yellow, Blue
Yellow, Red or Red, Yellow
Selections together with their arrangements are called permutations.
This is written as $^3P_2$, which means select and arrange 2 objects (here balls) at a time out of 3 objects, i.e. permutations of 2 objects taken at a time out of 3 objects. It is calculated as

$^3P_2$ = 3 × 2 = 6
$^8P_2$ = 8 × 7 = 56
$^{10}P_3$ = 10 × 9 × 8 = 720

The technical definition of a permutation is

$$^nP_r = \frac{n!}{(n - r)!}$$

It is read as permutations of r objects taken at a time out of n objects.

$^{10}P_3$ = 10! / (10 – 3)! = 10! / 7! = 10 × 9 × 8 = 720
$^nP_0$ = 1
$^nP_n$ = n!
$^nP_1$ = n
Getting back to our problem:
Balls drawn = 3, from a box containing 5 white and 4 black balls.
White balls = 5
Black balls = 4
Total balls = 9

Let A be the event that 3 white balls are drawn. The 3 white balls must come from the 5 available white balls.
n(A) = number of ways in which 3 white balls can be drawn out of 5 white balls

$$n(A) = {}^5C_3 = \frac{5 \times 4 \times 3}{1 \times 2 \times 3} = 10 \text{ ways}$$

Number of total outcomes:
n(S) = number of ways in which 3 balls can be drawn out of the total 9 balls

$$n(S) = {}^9C_3 = \frac{9 \times 8 \times 7}{1 \times 2 \times 3} = 3 \times 4 \times 7 = 84 \text{ ways}$$

$$P(A) = \frac{n(A)}{n(S)} = \frac{10}{84} = \frac{5}{42}$$
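The counts used in this section can be checked with Python's standard library, which provides `math.comb` and `math.perm` (available from Python 3.8). This is only a quick verification sketch; the variable name `p_three_white` is illustrative.

```python
import math
from fractions import Fraction

print(math.comb(3, 2))   # 3   ways to select 2 balls out of 3
print(math.comb(10, 3))  # 120
print(math.perm(3, 2))   # 6   ways to select and arrange 2 balls out of 3
print(math.perm(10, 3))  # 720

# Probability of drawing 3 white balls from a box of 5 white and 4 black balls
p_three_white = Fraction(math.comb(5, 3), math.comb(9, 3))
print(p_three_white)     # 5/42
```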

Fill in the blanks.

1) 0! = _______________.
2) 1! = _______________.
3) $^nC_r$ = _______________.
4) $^nC_n$ = _______________.
5) $^nC_1$ = _______________.


4.5 Set Theory and Probability Theorems

## We will learn sets in Business Mathematics. But for probability

the concept of sets must be very clear in mind.

## 1) The set of Natural Number N

N = { 1,2,3, ............. }

## So set means a group of articles, numbers, persons under some

special condition.

Theorem 1 - In a random experiment, if S is the sample space and A an event, then
i) $P(A) \geq 0$
ii) $P(\emptyset) = 0$
iii) $P(S) = 1$

Proof - Since A is an event, $A \subseteq S$.

i) $P(A) = \dfrac{n(A)}{n(S)} \geq 0$

ii) $P(\emptyset) = \dfrac{n(\emptyset)}{n(S)} = \dfrac{0}{n(S)} = 0$

iii) $P(S) = \dfrac{n(S)}{n(S)} = 1$
Points to Remember (Most IMP)

## 3) Probability of sure event is 1 (One).

Theorem 2 - If A and B are mutually exclusive events, then $P(A \cap B) = 0$.

Proof - Since A and B are mutually exclusive events,
$\therefore A \cap B = \emptyset$
$\therefore P(A \cap B) = P(\emptyset) = \dfrac{n(\emptyset)}{n(S)} = \dfrac{0}{n(S)} = 0$

Theorem 3 - If A and B are two mutually exclusive and exhaustive events, then P(A) + P(B) = 1.

Proof - Let S be the sample space and let A and B be two mutually exclusive and exhaustive events.
Then $A \cap B = \emptyset$
and $A \cup B = S$
$\because$ A and B are mutually exclusive events,
$\therefore A \cap B = \emptyset$ and $P(A \cap B) = P(\emptyset) = 0$
$\therefore P(A \cup B) = P(A) + P(B) - P(A \cap B)$
$= P(A) + P(B) - 0$
$= P(A) + P(B)$
but $A \cup B = S$
$\therefore P(A \cup B) = P(S) = 1$
$\therefore P(A) + P(B) = 1$
Theorem 4 - Addition Law - If A and B are mutually exclusive events, then $P(A \cup B) = P(A) + P(B)$.

Proof - Since $A \cap B = \emptyset$, we have $n(A \cup B) = n(A) + n(B)$. Therefore

$$P(A \cup B) = \frac{n(A \cup B)}{n(S)} = \frac{n(A) + n(B)}{n(S)} = \frac{n(A)}{n(S)} + \frac{n(B)}{n(S)} = P(A) + P(B)$$

In general, if $A_1, A_2, \ldots, A_k$ are mutually exclusive events, then

$$P(A_1 \cup A_2 \cup \cdots \cup A_k) = P(A_1) + P(A_2) + \cdots + P(A_k) = \sum_{i=1}^{k} P(A_i)$$

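Theorem 5 - For any two events A and B,
$P(A - B) = P(A) - P(A \cap B)$

Proof - Let A and B be two events. The events (A - B) and $(A \cap B)$ are mutually exclusive and together make up A:
$(A - B) \cap (A \cap B) = \emptyset$
$(A - B) \cup (A \cap B) = A$
$\therefore P(A) = P(A - B) + P(A \cap B)$ (using Theorem 4)
$\therefore P(A - B) = P(A) - P(A \cap B)$

Theorem 6 - Addition Law - For any two events A and B,
$P(A \cup B) = P(A) + P(B) - P(A \cap B)$

Proof - Let A and B be any two events. Then
$(A - B) \cap B = \emptyset$
$(A - B) \cup B = A \cup B$
$\therefore P(A \cup B) = P[(A - B) \cup B]$
$= P(A - B) + P(B)$ (using Theorem 4)
$= P(A) - P(A \cap B) + P(B)$ (using Theorem 5)
$\therefore P(A \text{ or } B) = P(A \cup B) = P(A) + P(B) - P(A \cap B)$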
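Theorem 7 - Addition Law for three events - For any three events A, B and C,
$P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A \cap B) - P(B \cap C) - P(A \cap C) + P(A \cap B \cap C)$

Proof - Let $B \cup C = D$, so that $A \cup B \cup C = A \cup D$.
Then $P(A \cup B \cup C) = P(A \cup D) = P(A) + P(D) - P(A \cap D)$ ....... (1) (by Theorem 6)
But
$A \cap D = A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$
$\therefore P(A \cap D) = P[(A \cap B) \cup (A \cap C)]$
$= P(A \cap B) + P(A \cap C) - P[(A \cap B) \cap (A \cap C)]$ (by Theorem 6)
$= P(A \cap B) + P(A \cap C) - P(A \cap B \cap C)$ ....... (2)
and $P(D) = P(B \cup C) = P(B) + P(C) - P(B \cap C)$ ....... (3)
Using (1), (2) and (3) we have
$\therefore P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A \cap B) - P(B \cap C) - P(A \cap C) + P(A \cap B \cap C)$
Corollary - If A, B, C are mutually exclusive events, then
$P(A \cup B) = P(A) + P(B)$
$P(B \cup C) = P(B) + P(C)$
$P(A \cup C) = P(A) + P(C)$
$\therefore P(A \cup B \cup C) = P(A) + P(B) + P(C)$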
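The addition law for three events can be checked by direct counting. The sketch below is only an illustration: the sample space (the numbers 1 to 30) and the three events (multiples of 2, 3 and 5) are assumptions chosen for the check, not part of the text.

```python
from fractions import Fraction

def prob(event, sample_space):
    """Classical probability: favourable outcomes / total outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))

# Hypothetical sample space and events, chosen only for illustration.
S = set(range(1, 31))                 # outcomes 1..30
A = {x for x in S if x % 2 == 0}      # multiples of 2
B = {x for x in S if x % 3 == 0}      # multiples of 3
C = {x for x in S if x % 5 == 0}      # multiples of 5

# Left-hand side: count the union directly.
lhs = prob(A | B | C, S)

# Right-hand side: the addition law for three events (Theorem 7).
rhs = (prob(A, S) + prob(B, S) + prob(C, S)
       - prob(A & B, S) - prob(B & C, S) - prob(A & C, S)
       + prob(A & B & C, S))

print(lhs, rhs)
```

Both sides come out as the same fraction, which is what the theorem asserts for any choice of A, B and C.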

Theorem 8 - For each event A, $P(A') = 1 - P(A)$, where $A'$ is the complementary event of A.

Theorem 9 - If $A \subset B$, then $P(A) \le P(B)$.

Proof - Given $A \subset B$,

$B = A \cup (B - A)$
and $A \cap (B - A) = \emptyset$
$\therefore P(B) = P[A \cup (B - A)]$
$= P(A) + P(B - A)$ (by Theorem 4)
$\therefore P(A) \le P(B)$, since $P(B - A) \ge 0$
Theorem 10 - If A is an event associated with a random experiment, then $0 \le P(A) \le 1$.
Example 1 - A and B are mutually exclusive events for which P(A) = 0.3, P(B) = p and $P(A \cup B) = 0.5$. Find the value of p.

Solution - Since A and B are mutually exclusive,
$P(A \cap B) = 0$
$\therefore P(A \cup B) = P(A) + P(B)$
$0.5 = 0.3 + p$
$p = 0.5 - 0.3 = 0.2$
Example 2 - In a class of 25 students with roll numbers 1 to 25, a student is picked at random to answer a question. Find the probability that the roll number of the selected student is a multiple of either 5 or 7.

Solution - Let A be the event that the roll number is a multiple of 5 and B be the event that the roll number is a multiple of 7.
S = {1, 2, 3, 4, ..., 23, 24, 25}
A = {5, 10, 15, 20, 25}
B = {7, 14, 21}
$\therefore n(S) = 25,\ n(A) = 5,\ n(B) = 3$

$P(A) = \dfrac{n(A)}{n(S)} = \dfrac{5}{25} = \dfrac{1}{5} = 0.2$

$P(B) = \dfrac{n(B)}{n(S)} = \dfrac{3}{25}$

Events A and B are mutually exclusive.
$\therefore A \cap B = \emptyset$
$\therefore P(A \cap B) = 0$
$\therefore P(A \cup B) = P(A) + P(B) = \dfrac{5}{25} + \dfrac{3}{25} = \dfrac{5 + 3}{25} = \dfrac{8}{25}$

Answer - The probability that the roll number of the selected student is a multiple of 5 or 7 is $\dfrac{8}{25}$.
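The same answer can be obtained by listing the roll numbers and counting. The following is a minimal sketch; the function name `prob_multiple_of_either` is a hypothetical helper introduced only for this illustration.

```python
from fractions import Fraction

def prob_multiple_of_either(n, a, b):
    """P(a number drawn from 1..n is a multiple of a or of b), by counting."""
    numbers = range(1, n + 1)
    A = {x for x in numbers if x % a == 0}   # multiples of a
    B = {x for x in numbers if x % b == 0}   # multiples of b
    # Taking the set union counts any common multiples only once.
    return Fraction(len(A | B), n)

print(prob_multiple_of_either(25, 5, 7))   # roll numbers 1..25, multiples of 5 or 7
```

Because the union of the two sets is counted directly, the helper also handles cases where the two events are not mutually exclusive.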

Example 3 - In a group of 100 people, 36 take tea, 45 take coffee and 20 take both tea and coffee. Find the probability that a person selected at random
1. Takes Tea
2. Takes Coffee
3. Takes Tea or Coffee
4. Takes Tea and Coffee both
5. Takes neither Tea nor Coffee
6. Takes only Tea
7. Takes only Coffee
8. Takes only one drink
9. Takes Tea if we know that the person takes coffee
10. Takes coffee if we know that the person takes tea.

[Venn diagram: only tea 16, both tea and coffee 20, only coffee 25, neither 39]

Event T: Person selected at random takes tea
Event C: Person selected at random takes coffee

Now,

$n(S) = 100$ (Total number of persons in the group)
$n(T) = 36$ (Number of persons taking tea)
$n(C) = 45$ (Number of persons taking coffee)
$n(T \cap C) = 20$ (Number of persons taking both tea and coffee)

$n(T \cup C) = n(T) + n(C) - n(T \cap C) = 36 + 45 - 20 = 61$ (Number of people taking tea or coffee or both)
$n((T \cup C)') = n(S) - n(T \cup C) = 100 - 61 = 39$ (Number of people taking neither tea nor coffee)

1. P (Person takes tea)
$P(T) = n(T)/n(S) = 36/100 = 0.36$
2. P (Person does not take tea)
$P(T') = 1 - P(T) = 1 - 0.36 = 0.64$
3. P (Person takes coffee)
$P(C) = n(C)/n(S) = 45/100 = 0.45$
4. P (Person takes tea or coffee)
$P(T \cup C) = n(T \cup C)/n(S) = 61/100 = 0.61$
5. P (Person takes tea and coffee)
$P(T \cap C) = n(T \cap C)/n(S) = 20/100 = 0.20$
6. P (Person takes neither tea nor coffee)
$P((T \cup C)') = n((T \cup C)')/n(S) = 39/100 = 0.39$
or $P((T \cup C)') = 1 - P(T \cup C) = 1 - 0.61 = 0.39$
7. P (Person takes only tea)
i.e. P (Person takes tea and not coffee)
$P(T - C) = P(T \cap C') = P(T) - P(T \cap C) = 0.36 - 0.20 = 0.16$
8. P (Person takes only coffee)
i.e. P (Person takes coffee and not tea)
$P(C - T) = P(C \cap T') = P(C) - P(T \cap C) = 0.45 - 0.20 = 0.25$
9. P (Person takes only one drink)
i.e. P (Person takes only tea or only coffee)
$P(T - C) + P(C - T) = 0.16 + 0.25 = 0.41$
or $P(T) + P(C) - 2 \times P(T \cap C) = 0.36 + 0.45 - 2 \times 0.20 = 0.41$
10. P (Person takes tea if we know that the person takes coffee)
i.e. pick a tea-taking person from the group of coffee-drinking persons
$P(T/C) = n(T \cap C)/n(C) = 20/45 = 4/9$
or $P(T/C) = P(T \cap C)/P(C) = 0.20/0.45 = 4/9$
11. P (Person takes coffee if we know that the person takes tea)
i.e. pick a coffee-taking person from the group of tea-drinking persons
$P(C/T) = n(T \cap C)/n(T) = 20/36 = 5/9$
or $P(C/T) = P(T \cap C)/P(T) = 0.20/0.36 = 5/9$
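All of these answers follow from the four counts given in the example, so they can be recomputed in a few lines. This is a sketch for checking the arithmetic; the variable names and the dictionary layout are assumptions made for the illustration.

```python
from fractions import Fraction

# Counts taken from the example.
n_S, n_T, n_C, n_both = 100, 36, 45, 20

n_union = n_T + n_C - n_both     # people taking tea or coffee or both
n_neither = n_S - n_union        # people taking neither drink

def p(count, total=n_S):
    """Probability as an exact fraction: count / total."""
    return Fraction(count, total)

answers = {
    "tea": p(n_T),
    "coffee": p(n_C),
    "tea or coffee": p(n_union),
    "tea and coffee": p(n_both),
    "neither": p(n_neither),
    "only tea": p(n_T - n_both),
    "only coffee": p(n_C - n_both),
    "only one drink": p(n_T + n_C - 2 * n_both),
    # Conditional probabilities restrict the total to the given group.
    "tea given coffee": p(n_both, n_C),
    "coffee given tea": p(n_both, n_T),
}

for label, value in answers.items():
    print(f"{label:18s} {value}  ({float(value):.2f})")
```

For the two conditional probabilities the denominator is the size of the group we already know the person belongs to, not the whole group of 100.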

Ex 6 Find the probability that a card drawn from a pack of cards is a black card or an ace.
Solution:
Total cards = 52, so $n(S) = 52$ (total sample space)
Black cards = 26
Aces = 4
Let A be the event of drawing a black card.
$n(A) = {}^{26}C_1 = 26$

So, $P(A) = \dfrac{n(A)}{n(S)} = \dfrac{26}{52}$

Let B be the event of drawing an ace.
$n(B) = {}^{4}C_1 = 4$

So, $P(B) = \dfrac{n(B)}{n(S)} = \dfrac{4}{52}$

Now 2 aces are black, that is, they are common to both black cards and aces.

$\therefore n(A \cap B) = 2$

So, $P(A \cap B) = \dfrac{n(A \cap B)}{n(S)} = \dfrac{2}{52}$

Hence $P(A \text{ or } B) = P(A \cup B) = P(A) + P(B) - P(A \cap B)$

$= \dfrac{26}{52} + \dfrac{4}{52} - \dfrac{2}{52} = \dfrac{26 + 4 - 2}{52} = \dfrac{28}{52} = \dfrac{7}{13}$

[Note: $A \cup B$ = A or B, $A \cap B$ = A and B]

Ex 7 An urn contains 13 balls numbered from 1 to 13. Find the probability that a ball selected at random carries a number that is a multiple of 3 or 4.

Solution

$n(S) = 13$
Let A be the event that the ball selected carries a number that is a multiple of 3, i.e. 3, 6, 9, 12.
$n(A) = 4$

$P(A) = \dfrac{4}{13}$

Let B be the event that the ball selected carries a number that is a multiple of 4, i.e. 4, 8, 12.
$n(B) = 3$

$P(B) = \dfrac{3}{13}$

and $n(A \cap B) = 1$ (There is only one number which is a multiple of both 3 and 4, which is 12)

$P(A \cap B) = \dfrac{1}{13}$

$P(A \cup B) = P(A) + P(B) - P(A \cap B) = \dfrac{4}{13} + \dfrac{3}{13} - \dfrac{1}{13} = \dfrac{6}{13}$

Ex 8 A bag contains 20 tickets marked from 1 to 20. One ticket is drawn at random. Find the probability that it will be a multiple of 2 or 5.
Solution:

$n(S) = 20$
Let A be the event that the ticket drawn is a multiple of 2, i.e. 2, 4, 6, 8, 10, 12, 14, 16, 18, 20.
$n(A) = 10$

$P(A) = \dfrac{10}{20}$

Let B be the event that the ticket drawn is a multiple of 5, i.e. 5, 10, 15, 20.
$n(B) = 4$

$P(B) = \dfrac{4}{20}$

$n(A \cap B) = 2$ (2 numbers, 10 and 20, are common to both events A and B)

$P(A \cap B) = \dfrac{2}{20}$

$P(A \cup B) = P(A) + P(B) - P(A \cap B) = \dfrac{10}{20} + \dfrac{4}{20} - \dfrac{2}{20} = \dfrac{12}{20} = \dfrac{3}{5}$

Ex 9 In an examination, 30% of the students have failed in Mathematics, 20% have failed in Chemistry and 10% have failed in both Mathematics and Chemistry. A student is selected at random.
i) What is the probability that the student has failed in Mathematics if it is known that he has failed in Chemistry?
ii) What is the probability that the student has failed in Mathematics or Chemistry?

Solution:

Event A: Student failed in Mathematics
Event B: Student failed in Chemistry

$P(A) = P(\text{Failed in Maths}) = \dfrac{30}{100}$

$P(B) = P(\text{Failed in Chem}) = \dfrac{20}{100}$

$P(A \cap B) = \dfrac{10}{100}$

i) $P(A/B) = \dfrac{P(A \cap B)}{P(B)} = \dfrac{10/100}{20/100} = \dfrac{10}{100} \times \dfrac{100}{20} = \dfrac{1}{2}$

ii) Failed either in Maths or in Chem, $P(A \cup B) = ?$
$P(A \cup B) = P(A) + P(B) - P(A \cap B)$

$= \dfrac{30}{100} + \dfrac{20}{100} - \dfrac{10}{100} = \dfrac{30 + 20 - 10}{100} = \dfrac{50 - 10}{100} = \dfrac{40}{100}$

$P(A \cup B) = \dfrac{40}{100} = 0.40$
Ex 10 The probability that a contractor will get a plumbing contract is 2/3, and the probability that he will not get an electric contract is 5/9. If the probability of getting at least one contract is 4/5, what is the probability that he will get both the contracts?
Solution

Event A = will get plumbing contract

$P(A) = \dfrac{2}{3}$

Event B = will get electric contract

$P(B') = \dfrac{5}{9}$

Probability of getting at least one contract, i.e. $P(A \cup B) = \dfrac{4}{5}$

$P(A \cap B) = ?$

$P(B) = 1 - \dfrac{5}{9} = \dfrac{9 - 5}{9} = \dfrac{4}{9}$

$P(A \cap B) = P(A) + P(B) - P(A \cup B)$

$= \dfrac{2}{3} + \dfrac{4}{9} - \dfrac{4}{5} = \dfrac{90 + 60 - 108}{135} = \dfrac{42}{135} = \dfrac{14}{45}$
Ex 11 An urn contains 7 black and 5 white balls. Two balls are drawn at random one after another. Find the probability that both balls drawn are black when:
i) the first ball drawn is not replaced before drawing the second (such drawing is called without replacement), and
ii) the first ball drawn is replaced before drawing the second ball (such drawing is called with replacement).

Solution

Black balls = 7
White balls = 5
Total balls = 7 + 5 = 12

i) When the first ball drawn is not replaced before drawing the second (without replacement), we find the probability by the usual method. 2 balls can be drawn out of 12 balls in ${}^{12}C_2$ ways, so

$n(S) = {}^{12}C_2 = \dfrac{12 \times 11}{1 \times 2} = 6 \times 11 = 66$

2 black balls can be drawn out of 7 black balls in ${}^{7}C_2$ ways.

$n(A) = {}^{7}C_2 = \dfrac{7!}{(7 - 2)!\,2!} = \dfrac{5! \times 6 \times 7}{5! \times 1 \times 2} = 21$

$P(A) = \dfrac{{}^{7}C_2}{{}^{12}C_2} = \dfrac{21}{66} = \dfrac{7}{22}$

ii) When the first ball drawn is replaced before drawing the second ball (with replacement), we consider the event in two steps.
A = 1st ball drawn is black.
For the first step 1 black ball is to be drawn out of 7 black balls.
$n(A) = {}^{7}C_1 = 7$
And for n(S), 1 ball is to be drawn out of a total of 12 balls.
$n(S) = {}^{12}C_1 = 12$

$P(A) = \dfrac{n(A)}{n(S)} = \dfrac{7}{12}$

B = 2nd ball drawn is black when the first ball is replaced.
At this stage the ball drawn is put back into the urn. We therefore have the same situation again, i.e. 7 black balls and 12 balls in total. For the second step 1 black ball is to be drawn again out of 7 black balls, as the earlier ball is replaced.
$n(B) = {}^{7}C_1 = 7$
And for n(S), 1 ball is to be drawn out of a total of 12 balls.
$n(S) = {}^{12}C_1 = 12$

$P(B) = \dfrac{n(B)}{n(S)} = \dfrac{7}{12}$

Since A and B are independent events,

$P(A \cap B) = P(A)\,P(B) = \dfrac{7}{12} \times \dfrac{7}{12} = \dfrac{49}{144}$
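The difference between the two ways of drawing can be seen side by side in a short calculation. This is a minimal sketch using exact fractions; the two function names are hypothetical and exist only for this illustration.

```python
from fractions import Fraction
from math import comb

def both_black_without_replacement(black, white):
    """Choose 2 of the black balls out of all ways to choose 2 balls."""
    total = black + white
    return Fraction(comb(black, 2), comb(total, 2))

def both_black_with_replacement(black, white):
    """Two independent draws, each with the same chance of black."""
    total = black + white
    return Fraction(black, total) ** 2

print(both_black_without_replacement(7, 5))   # urn with 7 black, 5 white
print(both_black_with_replacement(7, 5))
```

Replacing the first ball keeps the two draws independent, which is why the second function simply squares the single-draw probability.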

Ex 12 Three balls are drawn at random from a bag containing 6 blue and 4 red balls. What is the chance that 2 balls are blue and 1 is red?

Solution

Event A: 2 blue balls and 1 red ball are drawn.

3 balls can be drawn out of 10 balls in ${}^{10}C_3$ ways.

$n(S) = {}^{10}C_3 = \dfrac{10 \times 9 \times 8}{1 \times 2 \times 3} = 120$

2 blue balls can be drawn out of 6 in ${}^{6}C_2 = \dfrac{6 \times 5}{1 \times 2} = 15$ ways.

1 red ball can be drawn out of 4 in ${}^{4}C_1 = 4$ ways.

$n(A) = {}^{6}C_2 \times {}^{4}C_1 = 15 \times 4 = 60$

$P(A) = \dfrac{60}{120} = \dfrac{1}{2}$
Ex 13 A and B are independent events and P(A) = 1/3, P(B) = 3/4. Find $P(A \cup B)$.

Solution

A and B are independent events.

$P(A \cap B) = P(A)\,P(B) = \dfrac{1}{3} \times \dfrac{3}{4} = \dfrac{1}{4}$

And

$P(A \cup B) = P(A) + P(B) - P(A \cap B) = \dfrac{1}{3} + \dfrac{3}{4} - \dfrac{1}{4}$

$= \dfrac{1}{3} + \dfrac{2}{4} = \dfrac{4 + 6}{12} = \dfrac{10}{12} = \dfrac{5}{6}$

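Ex 14 Two persons X and Y appear in an interview for 2 vacancies in the same post. The probability of X's selection is 1/5 and that of Y's selection is 1/3. What is the probability that
i) both X and Y will be selected,
ii) only one of them will be selected, and
iii) none of them will be selected?

Solution

$P(A) = P(\text{X is selected}) = \dfrac{1}{5}$

$P(B) = P(\text{Y is selected}) = \dfrac{1}{3}$

i. P(both X and Y are selected) = P(A and B) = $P(A \cap B)$

$P(A \cap B) = P(A) \cdot P(B)$ (as A and B are independent)

$= \dfrac{1}{5} \times \dfrac{1}{3} = \dfrac{1}{15}$

ii. P(only one of X and Y is selected) = P(only A) + P(only B)

$P(\text{only } A) = P(A) - P(A \cap B) = P(A) - [P(A) \cdot P(B)]$

$= \dfrac{1}{5} - \dfrac{1}{15} = \dfrac{15 - 5}{75} = \dfrac{10}{75} = \dfrac{2}{15}$

And

$P(\text{only } B) = P(B) - P(A \cap B) = P(B) - [P(A) \cdot P(B)]$

$= \dfrac{1}{3} - \dfrac{1}{15} = \dfrac{15 - 3}{45} = \dfrac{12}{45} = \dfrac{4}{15}$

$\therefore P(\text{only one is selected}) = \dfrac{2}{15} + \dfrac{4}{15} = \dfrac{6}{15} = \dfrac{2}{5}$

iii. P(none of them is selected) = $P(A' \cap B')$. Since A and B are independent, so are $A'$ and $B'$.

$P(A' \cap B') = P(A') \cdot P(B') = \left(1 - \dfrac{1}{5}\right)\left(1 - \dfrac{1}{3}\right) = \dfrac{4}{5} \times \dfrac{2}{3} = \dfrac{8}{15}$

Check: the three cases are exhaustive, and $\dfrac{1}{15} + \dfrac{6}{15} + \dfrac{8}{15} = 1$.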
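For two independent events, the cases "both", "exactly one" and "none" cover every possibility, so their probabilities must add up to 1. The sketch below works this out for the selection problem; the function name `selection_probabilities` is an assumption made for the illustration.

```python
from fractions import Fraction

def selection_probabilities(p_x, p_y):
    """Return (both, exactly one, none) for two independent selections."""
    both = p_x * p_y
    only_one = p_x * (1 - p_y) + (1 - p_x) * p_y
    none = (1 - p_x) * (1 - p_y)
    return both, only_one, none

both, only_one, none = selection_probabilities(Fraction(1, 5), Fraction(1, 3))
print(both, only_one, none, both + only_one + none)
```

Note that "none selected" is the product of the two complements, which is different from 1 minus the probability that both are selected.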
4.6 Bayes' Theorem
Let $A_1, A_2, A_3, \ldots, A_n$ be mutually exclusive and exhaustive events and let B be any other event which is spread over the events $A_1, A_2, A_3, \ldots, A_n$. For example, consider 3 containers containing balls of different colours. Event $A_1$ is the selection of container 1, event $A_2$ is the selection of container 2 and event $A_3$ is the selection of container 3. Thus events $A_1$, $A_2$ and $A_3$ are mutually exclusive and exhaustive. If we define event B as drawing a yellow ball, then a yellow ball can be selected from container 1 or 2 or 3, and we say that event B is spread over events $A_1$, $A_2$ and $A_3$. If we know that the ball drawn is yellow, then the probability that the ball was selected from a particular container $A_i$ is given by

$P(A_i/B) = \dfrac{P(A_i)\,P(B/A_i)}{P(A_1)\,P(B/A_1) + P(A_2)\,P(B/A_2) + \cdots + P(A_i)\,P(B/A_i) + \cdots + P(A_n)\,P(B/A_n)}$

Alternatively,

$P(A_i/B) = \dfrac{P(A_i \cap B)}{P(A_1 \cap B) + P(A_2 \cap B) + \cdots + P(A_i \cap B) + \cdots + P(A_n \cap B)}$

Ex 15 Urn 1 contains 5 red and 5 black balls, urn 2 contains 4 red and 8 black balls, and urn 3 contains 3 red and 6 black balls. One urn is chosen at random and a ball is drawn. The colour of the ball is black. What is the probability that it has been drawn from urn 3?
Solution:
Let $A_1$ be the event of selection of urn 1
Let $A_2$ be the event of selection of urn 2
Let $A_3$ be the event of selection of urn 3
Let event B = black ball is drawn.
There are 3 urns in total, therefore

$P(A_1) = \dfrac{1}{3},\quad P(A_2) = \dfrac{1}{3},\quad P(A_3) = \dfrac{1}{3}$

We know that the ball drawn is black and we are interested in the probability that it has come from urn 3, i.e. we want to know $P(A_3/B)$.
Now by Bayes' theorem,

$P(A_3/B) = \dfrac{P(A_3 \cap B)}{P(A_1 \cap B) + P(A_2 \cap B) + P(A_3 \cap B)}$

Urn 1 contains 5 black balls and 10 balls in total. Therefore the probability of drawing a black ball from urn 1 (that is, we know that urn 1 is selected and we want the probability of drawing a black ball) is

$P(B/A_1) = \dfrac{\text{Black balls in urn 1}}{\text{Total balls in urn 1}} = \dfrac{5}{10} = \dfrac{1}{2}$

Similarly,

$P(B/A_2) = \dfrac{8}{12} = \dfrac{2}{3}$

$P(B/A_3) = \dfrac{6}{9} = \dfrac{2}{3}$

$P(A_1 \cap B) = P(B/A_1) \cdot P(A_1) = \dfrac{1}{2} \times \dfrac{1}{3} = \dfrac{1}{6}$

$P(A_2 \cap B) = P(B/A_2) \cdot P(A_2) = \dfrac{2}{3} \times \dfrac{1}{3} = \dfrac{2}{9}$

$P(A_3 \cap B) = P(B/A_3) \cdot P(A_3) = \dfrac{2}{3} \times \dfrac{1}{3} = \dfrac{2}{9}$

Therefore,

$P(A_3/B) = \dfrac{P(A_3 \cap B)}{P(A_1 \cap B) + P(A_2 \cap B) + P(A_3 \cap B)} = \dfrac{2/9}{1/6 + 2/9 + 2/9} = \dfrac{2/9}{33/54} = \dfrac{2}{9} \times \dfrac{54}{33} = \dfrac{4}{11}$
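Bayes' theorem has the same shape in every problem: multiply each prior by its conditional probability, then divide by the sum of those products. The sketch below applies that recipe to the three urns; the function name `posterior` is a hypothetical helper, not a standard library call.

```python
from fractions import Fraction

def posterior(priors, likelihoods):
    """Bayes' theorem: P(A_i / B) for every i.

    priors[i]      = P(A_i)
    likelihoods[i] = P(B / A_i)
    """
    joints = [p * l for p, l in zip(priors, likelihoods)]   # P(A_i and B)
    total = sum(joints)                                     # P(B)
    return [j / total for j in joints]

# Three urns chosen with equal probability; chance of black from each urn.
priors = [Fraction(1, 3)] * 3
likelihoods = [Fraction(5, 10), Fraction(8, 12), Fraction(6, 9)]

urn_posteriors = posterior(priors, likelihoods)
print(urn_posteriors[2])   # probability the black ball came from urn 3
```

The list returned always sums to 1, because the events $A_1, \ldots, A_n$ are mutually exclusive and exhaustive.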

Ex 16 Suppose there is a chance for a newly constructed building to collapse whether the design is faulty or not. The chance that the design is faulty is 10%. The chance that the building collapses is 95% if the design is faulty and otherwise it is 45%. It is seen that the building collapsed. What is the probability that it is due to faulty design?

Solution

Event $A_1$: Design is faulty
Event $A_2$: Design is not faulty
Event B: Building collapses

$P(A_1) = 10/100 = 0.10$ (i.e. design is faulty)
So the probability that the design is not faulty is
$P(A_2) = 1 - 0.10 = 0.90$
Now,
$P(B/A_1) = 0.95$ (Building collapses when we know that the design is faulty)
$P(B/A_2) = 0.45$ (Building collapses when we know that the design is not faulty)
We want to find the probability that the design is faulty when we know that the building collapsed, that is, $P(A_1/B)$. By Bayes' Theorem,

$P(A_1/B) = \dfrac{P(A_1)\,P(B/A_1)}{P(A_1)\,P(B/A_1) + P(A_2)\,P(B/A_2)} = \dfrac{P(A_1 \cap B)}{P(A_1 \cap B) + P(A_2 \cap B)}$

$= \dfrac{\frac{10}{100} \times \frac{95}{100}}{\frac{10}{100} \times \frac{95}{100} + \frac{90}{100} \times \frac{45}{100}} = \dfrac{0.095}{0.095 + 0.405} = \dfrac{0.095}{0.5} = 0.19$

Mathematical expectation is the average result obtained when an experiment is repeated a large number of times. For example, when a coin is tossed 4 times, the possible numbers of heads are 0, 1, 2, 3 and 4. If this experiment is repeated, say, 100 times, how many heads are expected in total? To make such a calculation we first calculate the mathematical expectation for one experiment by probability theory and then multiply the result by 100 to get the expected number of heads when 4 coins are thrown 100 times.

Mathematical expectation is denoted by E(x). The following example makes it clear.

Ex 17 Find the expected value of the number of heads and its variance when a coin is tossed 3 times.

Solution

No. of heads (x)                   0      1      2       3      Total
Probability (P)                    1/8    3/8    3/8     1/8    1
Px (row 2 x row 1)                 0      3/8    6/8     3/8    12/8 = 1.5
Px^2 (row 2 x square of row 1)     0      3/8    12/8    9/8    24/8 = 3

$E(x) = \sum Px = 1.5$. E(x) is the mean or average value and is denoted by m.
$E(x^2) = \sum Px^2 = 3$
Variance (x) $= E(x^2) - m^2 = 3 - (1.5)^2 = 3 - 2.25 = 0.75$

S.D. $(\sigma) = \sqrt{0.75} = 0.8660$

Note that
Variance $= E(x^2) - m^2 = E(x^2) - [E(x)]^2 = \sum P_i x_i^2 - \left[\sum P_i x_i\right]^2$
where
x = value of the outcome (here, the number of heads)
P = probability of that outcome.
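The table method translates directly into code: one sum for E(x), one sum for E(x^2), then the variance formula. The following is a sketch under the assumption of a fair coin; the function name `mean_and_variance` is introduced only for this illustration.

```python
from fractions import Fraction
from math import comb, sqrt

def mean_and_variance(values, probs):
    """E(x) and Var(x) = E(x^2) - [E(x)]^2 for a discrete distribution."""
    mean = sum(p * x for x, p in zip(values, probs))
    second_moment = sum(p * x * x for x, p in zip(values, probs))
    return mean, second_moment - mean ** 2

# Number of heads in 3 tosses of a fair coin: P(x) = 3Cx / 8.
values = [0, 1, 2, 3]
probs = [Fraction(comb(3, x), 8) for x in values]

mean, variance = mean_and_variance(values, probs)
print(mean, variance, round(sqrt(variance), 4))
```

Using exact fractions for the probabilities avoids rounding until the final square root.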

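Ex 18 Find the mathematical expectation of the number of points if a balanced die is thrown. Also find the standard deviation.

Solution

x              1      2      3      4      5      6
Probability    1/6    1/6    1/6    1/6    1/6    1/6

i.e. the probability of each outcome is $\dfrac{1}{6}$.

$\sum P_i x_i = \left(1 \times \tfrac{1}{6}\right) + \left(2 \times \tfrac{1}{6}\right) + \left(3 \times \tfrac{1}{6}\right) + \left(4 \times \tfrac{1}{6}\right) + \left(5 \times \tfrac{1}{6}\right) + \left(6 \times \tfrac{1}{6}\right)$

$= \dfrac{1}{6} + \dfrac{2}{6} + \dfrac{3}{6} + \dfrac{4}{6} + \dfrac{5}{6} + \dfrac{6}{6}$

$\sum P_i x_i = \dfrac{21}{6} = \dfrac{7}{2} = 3.5 = m = E(x)$

$\sum P_i x_i^2 = \left(1^2 \times \tfrac{1}{6}\right) + \left(2^2 \times \tfrac{1}{6}\right) + \left(3^2 \times \tfrac{1}{6}\right) + \left(4^2 \times \tfrac{1}{6}\right) + \left(5^2 \times \tfrac{1}{6}\right) + \left(6^2 \times \tfrac{1}{6}\right)$

$= \dfrac{1}{6} + \dfrac{4}{6} + \dfrac{9}{6} + \dfrac{16}{6} + \dfrac{25}{6} + \dfrac{36}{6}$

$\sum P_i x_i^2 = \dfrac{91}{6} = 15.17 = E(x^2)$

Variance (x) $= E(x^2) - m^2 = 15.17 - (3.5)^2 = 15.17 - 12.25 = 2.92$

S.D. $(\sigma) = \sqrt{2.92} = 1.71$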

Ex 19 A man is to play a game as follows: in three tosses of a balanced coin, he will get a reward of Rs. 20,000, Rs. 10,000, Rs. 5,000 or no reward if he gets three tails, two tails, one tail or no tail respectively. The entrance fee for the contest is Rs. 6,000. Will he play the game?

Solution

No. of tails (x)       0      1        2         3         Total
Probability (P)        1/8    3/8      3/8       1/8       1
Reward in Rs. (r)      0      5,000    10,000    20,000
P x r                  0      1,875    3,750     2,500     8,125

Expected value of reward $= \sum Pr$ = Rs. 8,125
As the expected earnings are more than the entrance fee of Rs. 6,000, the person will play the game.

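Ex 20 A box contains 8 tickets. 3 of the tickets carry a prize of Rs. 5 each and the other 5 a prize of Rs. 2 each.
i. If one ticket is drawn, what is the expected value of the prize?
ii. If 2 tickets are drawn, what is the expected value of the game?

Solution

i) One ticket is drawn

Event A: Ticket drawn carries a prize of Rs. 5
Event B: Ticket drawn carries a prize of Rs. 2
3 tickets have a prize of Rs. 5
5 tickets have a prize of Rs. 2

Total tickets = 8

$n(S) = {}^{8}C_1 = 8,\quad n(A) = {}^{3}C_1 = 3,\quad n(B) = {}^{5}C_1 = 5$

$P(A) = \dfrac{{}^{3}C_1}{{}^{8}C_1} = \dfrac{3}{8},\qquad P(B) = \dfrac{{}^{5}C_1}{{}^{8}C_1} = \dfrac{5}{8}$

Expected value of prize $= \sum P_i X_i = \left(5 \times \tfrac{3}{8}\right) + \left(2 \times \tfrac{5}{8}\right) = \dfrac{15}{8} + \dfrac{10}{8} = \dfrac{25}{8}$

$\sum P_i X_i = 3.125 = m = E(x)$

ii) Two tickets are drawn

Tickets drawn                         Prize (Rs.)    Probability
Both carry prize of Rs. 5             10             3C2 / 8C2 = 3/28
Both carry prize of Rs. 2             4              5C2 / 8C2 = 10/28
One carries Rs. 5, one Rs. 2          7              3C1 x 5C1 / 8C2 = 15/28

$\sum P_i x_i = \left(10 \times \tfrac{3}{28}\right) + \left(4 \times \tfrac{10}{28}\right) + \left(7 \times \tfrac{15}{28}\right) = \dfrac{30 + 40 + 105}{28} = \dfrac{175}{28} = 6.25$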
Ex 21 An article manufactured by a company has two parts, A and B. In the process of manufacture of part A, 9 out of 100 are likely to be defective. Similarly, 5 out of 100 are likely to be defective in the manufacture of part B. Calculate the probability that the assembled article will not be defective.

Solution

                 Part A                  Part B
Defective        P(A) = 9/100            P(B) = 5/100
Good             P(A') = 91/100          P(B') = 95/100

The assembled article is not defective when both parts are good, i.e. when $A'$ and $B'$ both occur, which is the event $A' \cap B'$.

Assuming that defects in the two parts occur independently,

$P(A' \cap B') = \dfrac{91}{100} \times \dfrac{95}{100} = \dfrac{8645}{10000} = 0.8645$

Ex 22 The probability that X can solve a problem in Business Statistics is 3/4, that Y can solve it is 2/5 and that Z can solve it is 5/9. If they all try independently, find the probability that the problem will be solved.

Solution:

Let A be the event that X solves the problem, P(A) = 3/4
Let B be the event that Y solves the problem, P(B) = 2/5
Let C be the event that Z solves the problem, P(C) = 5/9
P(S) = 1
$P(A' \cap B') + P(A \cup B) = P(S) = 1$
Similarly
$P(A' \cap B' \cap C') + P(A \cup B \cup C) = 1$
Now, P(A solves problem or B solves problem or C solves problem)
$= P(A \cup B \cup C) = 1 - P(A' \cap B' \cap C')$
$P(A')$ = P(X does not solve the problem) $= 1 - P(A) = 1 - 3/4 = 1/4$
$P(B')$ = P(Y does not solve the problem) $= 1 - P(B) = 1 - 2/5 = 3/5$
$P(C')$ = P(Z does not solve the problem) $= 1 - P(C) = 1 - 5/9 = 4/9$
Events A, B, C are independent. Therefore events $A'$, $B'$ and $C'$ are also independent.
$P(A' \cap B' \cap C') = 1/4 \times 3/5 \times 4/9 = 1/15$
P(Problem is solved)
$= P(A \cup B \cup C) = 1 - P(A' \cap B' \cap C') = 1 - 1/15 = 14/15$

Ex 23 Three machines A, B, C produce respectively 50%, 30% and 20% of the total number of items of a factory. The percentages of defective output of these machines are respectively 3%, 4% and 5%. If an item is selected at random, what is the probability that the selected item is defective?

Solution.

Let us define the following events.

A: Item produced on machine A
B: Item produced on machine B
C: Item produced on machine C
D: Selected item is defective
Then we get P(A) = 0.50
P(B) = 0.30
P(C) = 0.20
P(D/A) = 0.03; P(D/B) = 0.04; P(D/C) = 0.05
$P(D) = P(D \cap A) + P(D \cap B) + P(D \cap C)$
$= P(D/A) \times P(A) + P(D/B) \times P(B) + P(D/C) \times P(C)$
$= 0.50 \times 0.03 + 0.30 \times 0.04 + 0.20 \times 0.05$
$= 0.015 + 0.012 + 0.010$
$= 0.037$
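The calculation above is a weighted average: each machine's defect rate is weighted by its share of production. A minimal sketch follows; the function name `total_probability` is an assumption made for the illustration.

```python
from fractions import Fraction

def total_probability(shares, conditional_rates):
    """P(D) = sum over machines of P(machine) * P(D / machine)."""
    return sum(s * r for s, r in zip(shares, conditional_rates))

# Production shares and defect rates for machines A, B and C.
shares = [Fraction(50, 100), Fraction(30, 100), Fraction(20, 100)]
defect_rates = [Fraction(3, 100), Fraction(4, 100), Fraction(5, 100)]

p_defective = total_probability(shares, defect_rates)
print(p_defective, float(p_defective))
```

The shares must add up to 1, since every item is produced on exactly one of the three machines.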

Binomial distribution is also known as the 'Bernoulli distribution' after the Swiss mathematician James Bernoulli (1654-1705). This distribution can be used under the following conditions -

i. The random experiment is performed repeatedly a finite and fixed number of times. In other words n, the number of trials, is finite and fixed.
ii. The outcome of each trial of the random experiment is classified into one of two events:

A - the occurrence of the event = Success.
A' - the non-occurrence of the event = Failure.

Two of the most widely used discrete probability distributions are the binomial and the Poisson.

4.8.1 Properties of binomial distribution

1. A set of n identical trials, e.g. throwing a fair coin 4 times. Here n = 4.
2. Each trial has only two outcomes, success or failure. In the above experiment, if we define getting a head as success, the probability of success is the probability of getting a head. It is denoted by p. Thus p = 1/2, and the probability of failure is the probability of not getting a head. It is denoted by q, and q = 1 – p = 1 – 1/2 = 1/2. n and p are called the parameters of the binomial distribution.
3. The maximum number of successes is n, so the number of possible outcomes in a set is 1 + n. In the above case, when the coin is tossed 4 times, we might get 0, 1, 2, 3 or 4 heads, i.e. there are 5 outcomes. This distribution can be represented as x ~ B(n, p) where x is a binomial variable which takes values from 0 to n. In this case the distribution is represented as x ~ B(4, 1/2). The probability of success is the same for all trials.
4. The outcome of earlier trials does not affect the outcome of later trials.
5. The probability of getting x successes in a set of n trials is given by $f(x) = {}^{n}C_x\, p^x q^{(n - x)}$. This is called the probability mass function (PMF). For example, the probability of getting 3 heads in the above experiment is

$f(3) = {}^{4}C_3\, p^3 q^{(4 - 3)} = 4 \times \left(\dfrac{1}{2}\right)^3 \times \left(\dfrac{1}{2}\right)^1 = \dfrac{4}{16} = \dfrac{1}{4}$

6. The sum of all probabilities is 1, i.e. f(0) + f(1) + f(2) + f(3) + f(4) = 1, i.e. $\sum f(x) = 1$.
7. The mean or expected value of x, E(x), of a binomial experiment is m = np. In this case the expected number of heads is $E(x) = m = 4 \times \dfrac{1}{2} = 2$.
8. Variance of x, V(x) = npq. In this case $V(x) = 4 \times \dfrac{1}{2} \times \dfrac{1}{2} = 1$.
9. Standard deviation $= \sqrt{V(x)} = \sqrt{npq} = 1$.
10. The most likely value (mode) of x is given by the largest integer less than or equal to (n + 1)p; if (n + 1)p is itself an integer, then (n + 1)p – 1 and (n + 1)p are both modes.
11. Sums of binomials
If x ~ B(n, p) and y ~ B(m, p) are independent binomial variables, then z = x + y is again a binomial variable and its distribution is z ~ B(m + n, p).
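The properties above can be checked numerically for the coin example with n = 4 and p = 1/2. The sketch below computes the PMF from its formula and then derives the mean and variance from the PMF itself; the function name `binomial_pmf` is a hypothetical helper.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(x, n, p):
    """f(x) = nCx * p^x * q^(n - x), with q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p ** x * q ** (n - x)

n, p = 4, Fraction(1, 2)
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

# Mean and variance computed directly from the PMF.
mean = sum(x * f for x, f in enumerate(pmf))
variance = sum(x * x * f for x, f in enumerate(pmf)) - mean ** 2

print(pmf[3], sum(pmf), mean, variance)
```

The mean and variance obtained this way agree with the shortcut formulas np and npq.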

In probability theory and statistics, the Poisson distribution is a discrete probability distribution. It expresses the probability of a number of events occurring in a fixed period of time if these events occur with a known average rate and are independent of the time since the last event.

The distribution was discovered by Siméon-Denis Poisson (1781-1840) and published, together with his probability theory, in 1838.

[Figure: Poisson probability mass function]

The horizontal axis is the index x. The function is only non-zero at integer values of x. The connecting lines are only guides for the eye and do not indicate continuity.

1. It is characterized by an average rate of occurrence of an event over a fixed interval. It can also be characterized by an average rate of occurrence of an event over a fixed lot, e.g. the mean arrival rate of customers in a shop, which means the average number of customers visiting the shop per hour, where the time interval is defined as 1 hr.
2. There are only two outcomes, success or failure. In the above case arrival of a customer can be defined as a success and non-arrival as a failure.
3. It is a limiting case of the binomial distribution where the number of trials is large and the probability of success is low.
4. The average number of successes is denoted by m.
5. The Poisson distribution has only one parameter, which is m.
6. This distribution can be represented as x ~ Poi(m) where x is a Poisson variable which takes values from 0 onwards.
7. The probability of success in any interval is independent of the outcomes of earlier intervals.
8. The probability of getting x successes in an interval is given by $f(x) = \dfrac{e^{-m} m^x}{x!}$. This is called the probability mass function (PMF), where e = 2.71828.
e.g. if in the above case the average rate of arrival of customers is 2 customers per hour, then m = 2 and the probability of getting 3 customers in an hour is
$f(3) = \dfrac{e^{-m} m^x}{x!} = \dfrac{e^{-2} \times 2^3}{3!} = \dfrac{0.1353 \times 8}{6} = 0.1804$
9. The sum of all probabilities is 1, i.e. f(0) + f(1) + f(2) + f(3) + f(4) + … = 1, i.e. $\sum f(x) = 1$.
10. The mean or expected value of x, E(x), of a Poisson experiment is m.
11. Variance of x, V(x) = m. In this case V(x) = 2.
12. Standard deviation $= \sqrt{m} = \sqrt{2} = 1.414$.
13. The mode of a Poisson-distributed random variable with non-integer m is equal to the largest integer less than or equal to m. When m is a positive integer, the modes are m and m – 1.
14. Sums of Poissons:
If x ~ Poi(m) and y ~ Poi(n) are independent Poisson variables, then z = x + y is again a Poisson variable and its distribution is z ~ Poi(m + n).
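The Poisson PMF is easy to evaluate directly. The sketch below works out the customer-arrival example with m = 2 and confirms that the probabilities add up to 1; the function name `poisson_pmf` and the cut-off of 50 terms are assumptions made for the illustration.

```python
from math import exp, factorial

def poisson_pmf(x, m):
    """f(x) = e^(-m) * m^x / x!"""
    return exp(-m) * m ** x / factorial(x)

m = 2                                   # average of 2 customers per hour
p_three = poisson_pmf(3, m)             # probability of exactly 3 customers

# The PMF has infinitely many terms; 50 terms are far more than enough here.
total = sum(poisson_pmf(x, m) for x in range(50))
mean = sum(x * poisson_pmf(x, m) for x in range(50))

print(round(p_three, 4), round(total, 6), round(mean, 6))
```

The mean computed from the PMF comes back as m, in line with the property that a Poisson distribution has mean m.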

4.9.2 Examples of events that can be modelled as Poisson distributions include:

• The number of cars that pass through a certain point on a road (sufficiently distant from traffic lights) during a given period of time.
• The number of spelling mistakes a secretary makes while typing a single page.
• The number of phone calls at a call centre per minute.
• The number of times a web server is accessed per minute.
• The number of stars in a given volume of space.

4.10 Normal Distribution

The normal distribution, also called the Gaussian distribution (named after Carl Friedrich Gauss, a German mathematician, although Gauss was not the first to work with it), is a probability distribution of great importance in many fields. It is a family of distributions of the same general form, differing in their location and scale parameters: the mean ("average") and standard deviation ("variability") respectively. The standard normal distribution is the normal distribution with a mean of zero and a variance of one. It is often called the bell curve because the graph of its probability density resembles a bell.

[Figure: Probability density function]

[Figure: Cumulative distribution function]

Normal distributions are a family of distributions that have the same general shape. They are symmetric, with scores more concentrated in the middle than in the tails. Normal distributions are sometimes described as bell shaped.

One reason the normal distribution is important is that many psychological and educational variables are distributed approximately normally. Measures of reading ability, introversion, job satisfaction and memory are among the many psychological variables that are approximately normally distributed. Although the distributions are only approximately normal, they are usually quite close.

A second reason the normal distribution is so important is that it is easy for mathematical statisticians to work with. This means that many kinds of statistical tests can be derived for normal distributions. Almost all statistical tests discussed in this text assume normal distributions. Fortunately, these tests work very well even if the distribution is only approximately normal. Some tests work well even with very wide deviations from normality.

Examples of normal distributions are shown below. Notice that they differ in how spread out they are. The area under each curve is the same. The height of a normal distribution can be specified mathematically in terms of two parameters: the mean ($\mu$) and the standard deviation ($\sigma$).

4.10.1 Properties of Normal Distribution

1. It is a continuous distribution in which the probability of any individual value is zero. Instead it specifies the probability of an observation lying in a certain range.
2. It is a symmetric distribution about the mean. The number of observations goes on increasing up to the mean and, after crossing the mean, goes on reducing in exactly the same way as it increased up to the mean, i.e. the number of observations is maximum at the mean, and a vertical line drawn at the mean divides the distribution into two parts which exactly match each other.
3. 50% of observations are above the mean and 50% of observations are below the mean.
4. In a normal distribution, mean = mode = median.
5. The mean is denoted by $\mu$ and the standard deviation by $\sigma$.
6. The normal distribution has two parameters, $\mu$ and $\sigma$.
7. This distribution can be represented as $x \sim N(\mu, \sigma^2)$ where x is a normal variable which takes values from – infinity to + infinity.
8. The height of the curve at x is given by

$f(x) = \dfrac{1}{\sqrt{2\pi\sigma^2}}\; e^{-\frac{(x - \mu)^2}{2\sigma^2}}$

This is called the probability density function (PDF), where $\mu$ is the mean, $\sigma$ is the standard deviation, $\pi$ is the constant 3.14159 and e is the base of natural logarithms, equal to 2.718282. x can take any value from – infinity to + infinity. The probability of getting an observation up to x is the area under this curve to the left of x.
9. Total probability, i.e. the whole area under the curve, = 1.
10. The mean or expected value of x, E(x), of a normal distribution is $\mu$.
11. Variance of x, $V(x) = \sigma^2$.
12. Standard deviation $= \sigma$.
13. The mode of a normal distribution $= \mu$.
14. The inflection points of the curve occur one standard deviation away from the mean, i.e. at $\mu - \sigma$ and $\mu + \sigma$.
15. Sums of normal distributions
If $x \sim N(\mu_1, \sigma_1^2)$ and $y \sim N(\mu_2, \sigma_2^2)$ are independent normal variables, then r = x + y is again a normal variable and its distribution is $r \sim N(\mu_1 + \mu_2,\ \sigma_1^2 + \sigma_2^2)$. If r = x – y, then $r \sim N(\mu_1 - \mu_2,\ \sigma_1^2 + \sigma_2^2)$.

Relation between various statistical parameters for normal distribution.

[Figure: normal curves marking the ranges $\mu - \sigma$ to $\mu + \sigma$ and $\mu - 2\sigma$ to $\mu + 2\sigma$]

• 68.27% of the area under the curve is within one standard deviation of the mean ($1\sigma$ range, i.e. $\mu \pm \sigma$).
• 95.45% of the area is within two standard deviations ($2\sigma$ range, i.e. $\mu \pm 2\sigma$).
• 99.73% of the area is within three standard deviations ($3\sigma$ range, i.e. $\mu \pm 3\sigma$).
• 99.99% of the area is within four standard deviations ($4\sigma$ range, i.e. $\mu \pm 4\sigma$).
• 99.9999% of the area is within five standard deviations ($5\sigma$ range, i.e. $\mu \pm 5\sigma$).
• 99.999999% of the area is within six standard deviations ($6\sigma$ range, i.e. $\mu \pm 6\sigma$).
• 99.999999999% of the area is within seven standard deviations ($7\sigma$ range, i.e. $\mu \pm 7\sigma$).
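These percentages come from the area under the normal curve between $\mu - k\sigma$ and $\mu + k\sigma$, which can be written with the error function as erf(k / sqrt(2)). The sketch below evaluates that expression for the first few values of k; the function name `coverage` is a hypothetical helper.

```python
from math import erf, sqrt

def coverage(k):
    """Area of a normal distribution within k standard deviations of the mean."""
    return erf(k / sqrt(2))

for k in range(1, 5):
    print(f"within {k} standard deviation(s): {100 * coverage(k):.2f}%")
```

The result does not depend on the particular values of the mean and standard deviation, only on how many standard deviations wide the range is.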

1) _______________ is a symmetric distribution about mean.
2) In normal distribution mean = _______________.
3) PDF stands for _______________.
4) Total probability of the whole area under the curve is equal to _______________.
5) If the values of all three, mean, mode and median, are equal then the distribution is _______________.

It is possible to relate all normal random variables to the standard normal.

If $X \sim N(\mu, \sigma^2)$, then

$Z = \dfrac{X - \mu}{\sigma}$

is a standard normal variable, $Z \sim N(0, 1)$. Thus the standard normal distribution has
mean = mode = median = 0
Variance = Standard Deviation = 1

4.11.1 Relations between various parameters for a normal distribution

• average deviation / standard deviation = 0.7979
• standard deviation / average deviation = 1.2533
• probable error / standard deviation = 0.6745
• probable error / average deviation = 0.8453
• probable error / average error = 0.8453
• average error / probable error = 1.183
• standard deviation / probable error = 1.4826
• IQR = 1.35 × $\sigma$
• The first quartile of any normal distribution is located 0.67$\sigma$ below the mean and the third quartile is 0.67$\sigma$ above the mean.
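The quartile positions and the IQR factor can be recovered from the standard normal distribution itself. The sketch below uses `statistics.NormalDist` from the Python standard library (available from Python 3.8); the helper `z_score` and the sample values 70, 50 and 10 are assumptions made for the illustration.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Standardise x: how many standard deviations x lies from the mean."""
    return (x - mu) / sigma

standard_normal = NormalDist(mu=0, sigma=1)

# Quartiles of the standard normal, in units of the standard deviation.
q1 = standard_normal.inv_cdf(0.25)
q3 = standard_normal.inv_cdf(0.75)
iqr = q3 - q1

print(round(q1, 4), round(q3, 4), round(iqr, 3))
print(z_score(70, mu=50, sigma=10))   # hypothetical observation
```

The third quartile of the standard normal is the same number as the ratio of probable error to standard deviation listed above.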

4.12 Summary

Probability is the theory of chance. Its value lies in the closed interval from zero to one, which means that both values, zero and one, are included. A probability value is usually a fraction and is never negative. A distribution describes how the given data are spread about the measures of central tendency such as the mean, mode and median. On that basis there are different types of distribution such as Normal, Skewed and Poisson.

4.2

1) 1 and 0
2) Experiment
4) Sample space
5) Dependent event

4.4

1) 1
2) 1
3) $\dfrac{n!}{r!\,(n - r)!}$
4) 1
5) n
4.10

1) Normal distribution
2) = Median = mode
3) Probability density function
4) One
5) Normal

4.14 Questions for Self - Study

1) Short Notes –
a) Normal Distribution
b) Probability
2) What is the Poisson distribution?
3) Write the properties of the Poisson distribution.
4) Explain the term ‘Expected value’.
5) Where can Bayes’ theorem be applied?




CHAPTER – 5

INDEX NUMBERS

5.0 Objectives
5.1 Introduction
5.2 Price and Quantity Relatives
5.3 Price and Quantity Index Numbers
5.4 Laspeyres’ and Paasche’s Index Numbers
5.6 Illustration
5.7 Various Index Numbers
5.8 Consumer price index
5.8.1 Calculating a consumer Price Index
5.9 Summary
5.11 Questions for Self - Study

5.0 Objectives

After studying the concept of Index Numbers, students will understand how they are calculated and how they support economic decisions in business, and will be familiar with the following –
• Price and quantity relatives
• Numerical problems on calculating a Price Index
• The basis of a weighted Index Number

5.1 Introduction

Index numbers give us a comparison of changes in a numerical value between two different time periods. The comparison is expressed as a percent relative change.

Let us consider a price index. The relative change is the ratio of the price in one period to the price in the other period. For example, if the prices of wheat in the years 1990 and 1999 were Rs. 5/- per kg and Rs. 9/- per kg, then the relative change is 9/5, and the price index of 1999 for wheat with respect to the base year 1990 is (9/5) × 100 = 180, which is the percent relative change. The period with which the comparison is made is called the base period. The index number is independent of the unit used for comparison. However, the prices of both periods must be expressed in the same units.

This tells us that if we treat the price of one unit of wheat in 1990 as 100, then by 1999 the price has grown to 180, which is an 80% increase in the price of wheat compared with 1990.

Index numbers can be calculated for comparisons of Production Volume, GNP, Price, Quantity, Expenditure etc.

We need to combine the index numbers for all the commodities to arrive at a general picture of the prices of the two periods.

5.2 Price and Quantity Relatives

If we let $P_O$ be the price in the base period and let $P_N$ be the price in the later period, then the price relative for the price change between these periods is given by $\frac{P_N}{P_O} \times 100$.

Price Relative is given by:

$$\frac{\text{Price of one commodity in the current year}}{\text{Price of the same commodity in the base year}} \times 100 = \frac{P_N}{P_O} \times 100$$

Quantity Relative is given by:

$$\frac{\text{Quantity of one commodity in the current year}}{\text{Quantity of the same commodity in the base year}} \times 100 = \frac{Q_N}{Q_O} \times 100$$
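As a small sketch of these two relatives, the helper below applies the same ratio to prices and to quantities. The function name `relative` is hypothetical. The wheat prices come from the introduction, and the wheat quantities are the figures used in the illustration of section 5.6.

```python
def relative(current: float, base: float) -> float:
    """Percent relative: the current value expressed with the base value taken as 100."""
    return 100 * current / base


# Wheat price: Rs. 5 per kg in 1990 and Rs. 9 per kg in 1999.
price_relative = relative(current=9, base=5)

# Wheat quantity: 500 tons in 1990 and 1200 tons in 1999.
quantity_relative = relative(current=1200, base=500)

print(price_relative, quantity_relative)
```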

5.3 Price and Quantity Index Numbers

If we are interested in knowing the effect of all commodities in a single index, we need to calculate the index number taking into account all commodities. Such index numbers are called aggregative index numbers. There are two types of such index numbers: simple aggregative index numbers and weighted aggregative index numbers.

Simple aggregative price index:

$$\frac{\text{Sum of prices of all commodities of the current year}}{\text{Sum of prices of all commodities of the base year}} \times 100$$

A base weighted index concentrates on measuring price changes from a base year. It is called a base weighted index because we use the quantities purchased in the base year (here 1990) to weight the unit prices in both years. Keeping the quantities constant in this way means that any change in the calculated expenditure is due solely to price changes.
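The simple aggregative formula of this section can be sketched as follows. The prices are the 1990 and 1999 figures used in the illustration of section 5.6, and the dictionary names are illustrative. Note that this index ignores quantities altogether, which is the limitation that the weighted indices address.

```python
base_prices = {"Wheat": 5, "Rice": 4, "Cereals": 7}       # Rs/kg in 1990
current_prices = {"Wheat": 9, "Rice": 10, "Cereals": 14}  # Rs/kg in 1999

# Sum of current year prices divided by sum of base year prices, times 100.
simple_aggregative = 100 * sum(current_prices.values()) / sum(base_prices.values())

print(simple_aggregative)
```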

5.4 Laspeyres’ and Paasche’s Index Numbers

The Laspeyres’ price index is given by $\frac{\sum P_N Q_O}{\sum P_O Q_O} \times 100$.

In practice, the Laspeyres’ price index is usually calculated using price relatives. For this method, we have to use the expenditures in the base year as weights. This sounds more complicated, but the reason we do this is that it is easier to obtain data on expenditure than on actual quantities bought when we are dealing with a large, complicated index. For example, cost of living weights are obtained by using sampling in the Survey of House Expenditure. Indeed, for some elements of the cost of living expenses, ‘quantities’ don’t even make sense. You can’t really talk about ‘quantities’ of public transport, for example.
Here is the general rule for working out the base weighted or Laspeyres’ price index using price relatives.

$$\text{Laspeyres' Price Index} = \frac{\sum \left( \frac{P_N}{P_O} \times 100 \right) \times P_O Q_O}{\sum P_O Q_O}$$


Notice that cancelling the $P_O$ above and below on the top line and taking out the factor of 100 gives us the same answer as before.

$$\frac{\sum P_N Q_O}{\sum P_O Q_O} \times 100$$

The end weighted or Paasche’s price index = $\frac{\sum P_N Q_N}{\sum P_O Q_N} \times 100$

Laspeyres’ Quantity Index = $\frac{\sum Q_N P_O}{\sum Q_O P_O} \times 100$

Paasche’s Quantity Index = $\frac{\sum Q_N P_N}{\sum Q_O P_N} \times 100$
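The equivalence between the price-relative form and the aggregative form of the Laspeyres’ price index can be checked with a short sketch. The data are the wheat, rice and cereals figures from the illustration in section 5.6, and the variable names are illustrative.

```python
base_price = {"Wheat": 5, "Rice": 4, "Cereals": 7}
current_price = {"Wheat": 9, "Rice": 10, "Cereals": 14}
base_quantity = {"Wheat": 500, "Rice": 600, "Cereals": 400}

# Base year expenditure on each commodity serves as its weight.
weight = {c: base_price[c] * base_quantity[c] for c in base_price}

# Price relative of each commodity, in percent.
price_relative = {c: 100 * current_price[c] / base_price[c] for c in base_price}

# Laspeyres index as an expenditure-weighted average of the price relatives.
via_relatives = sum(price_relative[c] * weight[c] for c in base_price) / sum(weight.values())

# Laspeyres index in aggregative form: sum(Pn*Qo) / sum(Po*Qo) * 100.
aggregative = (100 * sum(current_price[c] * base_quantity[c] for c in base_price)
               / sum(weight.values()))

print(round(via_relatives, 2), round(aggregative, 2))
```

Both routes give the same figure because the base price in each weight cancels the base price in the corresponding relative.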

The base weighted index has the advantage that we only have to
work out the base year expenditures once. We can then use these in the
calculation of the index in any subsequent period. However, this index
can be misleading in telling us what is actually going on. For example,
the fluctuations in fashion might have a considerable impact on an index.
Suppose that skirts were considered as a separate item in a women’s
clothing manufacturer’s index. The greatly increased relative popularity
of trousers would dramatically affect the quantities sold and any index
which used base year quantities from some time back would be mislead-
ing. The next index that we consider avoids this particular problem.

and Laspeyre’s indices.

Laspeyres’ Price Index is based upon the base year quantities, so only the current year’s prices are required for the calculation of the index, whereas for Paasche’s Price Index both the current year quantities and prices are required. Laspeyres’ Price Index is slightly easier to calculate and can be obtained in situations where the current year’s quantities are unknown.

Paasche’s Price Index uses current quantities and so is more relevant and topical, especially where the quantities may have changed dramatically.

Paasche’s Price Index has to be recalculated each period as new data comes along, because the current period will have changed. Where there are substantial quantity changes, the previously calculated Paasche’s Price Indices may take substantially different values from the newly calculated ones based upon the new current year quantities. Laspeyres’ Price Index does not need to be recalculated, as the base period remains the same.

In summary, Paasche’s Price Index is slightly more relevant and has a better interpretation, as it uses current quantities, but it is slightly more inconvenient to calculate. If both quantities and prices are readily available, then it is a simple matter to calculate both and the combined index. If there are substantial differences between the Laspeyres’ and Paasche’s Price Indices, then this conveys information about the changes in prices and quantities.

Expenditure is made up of two different elements, prices and quantities bought. We’ll suppose first that we are particularly interested in price changes over time. In complicated situations, where we need to compare the prices of many items over many different time intervals (such as for the Retail Price Index), we work with the different prices, and use the quantities to weight them in different ways for different index numbers.

5.6 Illustration

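Price Index Calculation

Weights: $P_O Q_O$ and $P_N Q_O$ use the base year quantity; $P_O Q_N$ and $P_N Q_N$ use the current year quantity; $P_O W$ and $P_N W$ use any suitable quantity $W$.

| Commodity | Price 1990 (Rs/Kg) | Price 1999 (Rs/Kg) | Quantity 1990 (tons) | Quantity 1999 (tons) | $P_O Q_O$ | $P_N Q_O$ | $P_O Q_N$ | $P_N Q_N$ | $P_O W$ | $P_N W$ | $W$ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Wheat | 5 | 9 | 500 | 1200 | 2500 | 4500 | 6000 | 10800 | 3000 | 5400 | 600 |
| Rice | 4 | 10 | 600 | 1800 | 2400 | 6000 | 7200 | 18000 | 3200 | 8000 | 800 |
| Cereals | 7 | 14 | 400 | 900 | 2800 | 5600 | 6300 | 12600 | 2100 | 4200 | 300 |
| Total | | | | | 7700 | 16100 | 19500 | 41400 | 8300 | 17600 | |

Quantity Index Calculation

| | $Q_O P_O$ | $Q_N P_O$ | $Q_O P_N$ | $Q_N P_N$ | $Q_O W$ | $Q_N W$ |
|---|---|---|---|---|---|---|
| Total | 7700 | 19500 | 16100 | 41400 | 12700 | 32700 |

Price and Quantity Relatives

| Commodity | Price 1990 (Rs/Kg) | Price 1999 (Rs/Kg) | Quantity 1990 (tons) | Quantity 1999 (tons) | $(P_N / P_O) \times 100$ | $(Q_N / Q_O) \times 100$ |
|---|---|---|---|---|---|---|
| Wheat | 5 | 9 | 500 | 1200 | 180 | 240 |
| Rice | 4 | 10 | 600 | 1800 | 250 | 300 |
| Cereals | 7 | 14 | 400 | 900 | 200 | 225 |
| Total | 16 | 33 | 1500 | 3900 | 630 | 765 |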

$Q_O$: Quantities of Base Year (1990)
$Q_N$: Quantities of Current Year (1999)
$P_O$: Prices of Base Year (1990)
$P_N$: Prices of Current Year (1999)

$W$: any standard value other than the base and current year quantities.

$\sum P_O = 16$

$\sum P_N = 33$

$\sum Q_O = 1500$

$\sum Q_N = 3900$

$\sum P_O Q_O = 7700$

$\sum P_O Q_N = 19500$

$\sum P_N Q_O = 16100$

$\sum P_N Q_N = 41400$

1. Simple average of relatives Price Index:

$$\frac{\sum \left( \frac{P_N}{P_O} \times 100 \right)}{\text{Number of Commodities}} = \frac{630}{3} = 210$$

3. Laspeyres’ Price Index = $(\sum P_N Q_O / \sum P_O Q_O) \times 100$

= (16100 / 7700) × 100 = 209.09

4. Laspeyres’ Quantity Index = $(\sum Q_N P_O / \sum Q_O P_O) \times 100$

= (19500 / 7700) × 100 = 253.25

5. Paasche’s Price Index = $(\sum P_N Q_N / \sum P_O Q_N) \times 100$

= (41400 / 19500) × 100 = 212.31

6. Paasche’s Quantity Index = $(\sum Q_N P_N / \sum Q_O P_N) \times 100$

= (41400 / 16100) × 100 = 257.14

7. Fixed Weight Price Index = $(\sum P_N W / \sum P_O W) \times 100$

= (17600 / 8300) × 100 = 212.05

8. Fixed Weight Quantity Index = $(\sum Q_N W / \sum Q_O W) \times 100$

= (32700 / 12700) × 100 = 257.48

9. Value or Expense Index = $(\sum Q_N P_N / \sum Q_O P_O) \times 100$

= (41400 / 7700) × 100 = 537.66
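The weighted indices in this illustration can be recomputed with a short sketch. The lists hold the wheat, rice and cereals figures from the table above, in that order. The helper name `total` is an illustrative choice.

```python
p0 = [5, 4, 7]          # prices in the base year 1990 (Rs/Kg)
pn = [9, 10, 14]        # prices in the current year 1999 (Rs/Kg)
q0 = [500, 600, 400]    # quantities in the base year 1990 (tons)
qn = [1200, 1800, 900]  # quantities in the current year 1999 (tons)


def total(prices, quantities):
    """Sum of price times quantity over all commodities."""
    return sum(p * q for p, q in zip(prices, quantities))


laspeyres_price = 100 * total(pn, q0) / total(p0, q0)
laspeyres_quantity = 100 * total(p0, qn) / total(p0, q0)
paasche_price = 100 * total(pn, qn) / total(p0, qn)
paasche_quantity = 100 * total(pn, qn) / total(pn, q0)
value_index = 100 * total(pn, qn) / total(p0, q0)

for name, value in [
    ("Laspeyres price", laspeyres_price),
    ("Laspeyres quantity", laspeyres_quantity),
    ("Paasche price", paasche_price),
    ("Paasche quantity", paasche_quantity),
    ("Value", value_index),
]:
    print(f"{name} index = {value:.2f}")
```

A useful cross-check is that the Laspeyres’ price index multiplied by the Paasche’s quantity index, divided by 100, reproduces the value index.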

There are several types of indices defined, among them those listed in the following table.

| Index | Abbr. | Formula |
|---|---|---|
| Bowley index | $P_B$ | $\frac{1}{2}(P_L + P_P)$ |
| Fisher index | $P_F$ | $\sqrt{P_L \times P_P}$ |
| Geometric mean index | $P_G$ | $\left[ \prod \left( \frac{p_n}{p_o} \right)^{v_o} \right]^{1 / \sum v_o} \times 100$, where $v_o = p_o q_o$ |
| Harmonic mean index | $P_H$ | $\frac{\sum p_o q_o}{\sum (p_o^2 q_o / p_n)} \times 100$ |
| Laspeyres’ index | $P_L$ | $\frac{\sum p_n q_o}{\sum p_o q_o} \times 100$ |
| Marshall-Edgeworth index | $P_{ME}$ | $\frac{\sum p_n (q_o + q_n)}{\sum p_o (q_o + q_n)} \times 100$ |
| Paasche’s index | $P_P$ | $\frac{\sum p_n q_n}{\sum p_o q_n} \times 100$ |
| Walsh index | $P_W$ | $\frac{\sum p_n \sqrt{q_o q_n}}{\sum p_o \sqrt{q_o q_n}} \times 100$ |

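Several of the indices defined above (Bowley, Fisher, Marshall-Edgeworth and Walsh) can be evaluated on the data of the illustration in section 5.6. This is only a sketch: the variable names are illustrative, and the formulas are the standard definitions of these indices.

```python
import math

p0 = [5, 4, 7]          # base year prices: wheat, rice, cereals
pn = [9, 10, 14]        # current year prices
q0 = [500, 600, 400]    # base year quantities
qn = [1200, 1800, 900]  # current year quantities


def total(prices, quantities):
    """Sum of price times quantity over all commodities."""
    return sum(p * q for p, q in zip(prices, quantities))


laspeyres = 100 * total(pn, q0) / total(p0, q0)
paasche = 100 * total(pn, qn) / total(p0, qn)

# Bowley: arithmetic mean of Laspeyres and Paasche; Fisher: their geometric mean.
bowley = (laspeyres + paasche) / 2
fisher = math.sqrt(laspeyres * paasche)

# Marshall-Edgeworth: weights are the sums of base and current quantities.
summed_q = [a + b for a, b in zip(q0, qn)]
marshall_edgeworth = 100 * total(pn, summed_q) / total(p0, summed_q)

# Walsh: weights are the geometric means of base and current quantities.
geometric_q = [math.sqrt(a * b) for a, b in zip(q0, qn)]
walsh = 100 * total(pn, geometric_q) / total(p0, geometric_q)

print(round(bowley, 2), round(fisher, 2), round(marshall_edgeworth, 2), round(walsh, 2))
```

For this data set all four values fall between the Laspeyres’ and Paasche’s price indices.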

One problem in the construction of any index number is choos-
ing a suitable base period. We want a base where prices ( or Volumes )
were not unnaturally high or low. An example of this would be if bad
weather had caused an extreme shortage in a particular crop which then
led to a very high price for it. Also, people are not happy with a base
period which is too far in the past. Furthermore, tastes and availability
can change a great deal over time so such an index could be seriously
misleading. One way sometimes used to avoid these problems is to use
a chain-based system where, in calculating successive index numbers,
the base used is the previous period. A chain-based index number is
particularly suited for period by period comparisons, but a fixed-based
index number makes it easier to compare the movement of prices over
time.
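The difference between a fixed-base and a chain-based index can be sketched for a single commodity. The prices below are hypothetical and serve only to show the mechanics. Each link relative uses the previous period as its base, and multiplying the links together recovers the fixed-base series.

```python
prices = [50, 55, 66, 60]  # hypothetical prices in four consecutive periods

# Fixed-base index: every period is compared with the first period.
fixed_base = [100 * p / prices[0] for p in prices]

# Chain-based (link) relatives: every period is compared with the previous one.
links = [100 * current / previous for previous, current in zip(prices, prices[1:])]

# Chaining the links back together reproduces the fixed-base series.
chained = [100.0]
for link in links:
    chained.append(chained[-1] * link / 100)

print(fixed_base)
print([round(link, 2) for link in links])
print([round(value, 2) for value in chained])
```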

Fill in the blanks:
1. Index Numbers are also called ________________.
2. For an Index Number, the period with which comparison is made is called the ________________.
3. PO is the price in the ________________.
4. PN is the price in the ________________.
5. The Laspeyres’ Price Index Number is given by ________________.
6. Paasche’s Quantity Index = ________________.
7. QO = ________________.
8. QN = ________________.
9. Price in the base year is denoted by ________________.
10. Quantity in the current year is denoted by ________________.

A Consumer Price Index usually contains the following categories:

a) Food: cereals; meat and fish; fruits and vegetables; miscellaneous food such as dairy products, sugar, tea, pies, etc.
b) Drinks, tobacco and betel nut: soft drinks (treated waters and cordials); alcoholic drinks; cigarettes and tobacco; betel nut.
c) Clothing and footwear: men’s and boys’ clothing; women’s and girls’ clothing; other clothing such as nappies, accessories, etc.; footwear.
d) Rent, council charges, fuel and power: dwelling rentals; council charges for water, sewerage and garbage disposal; electricity and kerosene.
e) Household equipment and operation: durable goods (e.g. sewing machine, kerosene stove); semi-durable goods (e.g. sheets, tableware); non-durable goods (e.g. matches, laundry soap, insecticides).
f) Transport and communication: motor vehicle purchase; motor vehicle operation (petrol, oil, repairs, parts, accessories, licences and insurance); airline, taxi, bus and public motor vehicle (PMV) fares; telephone and postal charges.
g) Miscellaneous: medical and health care; entertainment and cultural goods and services (e.g. sound equipment, newspapers and magazines, cinema admissions, education fees); other goods (e.g. items for personal care, writing and drawing materials).

We will now see how information on prices can be used to create a weighted price index for the economy. This is the sort of data which is then used to calculate the rate of inflation.

| Category | Price Index (I) | Weighting (W) | Price × weight |
|---|---|---|---|
| Total | | 100 | 10437 |


The price index for each category shows what has happened to the price level since the base year. To generate a weighted price index, we multiply the price index for each category by its weight and then sum these. We then divide by the sum of the weights (100) to find an overall price index (104.37), or 104.4 rounded to one decimal place.
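The weighting procedure just described can be sketched as follows. The categories, price indices and weights below are hypothetical and are not the figures behind the totals quoted in this section. Only the method is the same: multiply each index by its weight, sum, and divide by the sum of the weights.

```python
# Hypothetical categories: (price index I, weight W). The weights sum to 100.
categories = {
    "Food": (105.0, 40),
    "Housing": (102.0, 30),
    "Transport": (108.0, 20),
    "Clothing": (97.0, 10),
}

# Sum of (price index x weight) and sum of the weights.
weighted_sum = sum(index * weight for index, weight in categories.values())
total_weight = sum(weight for _, weight in categories.values())

# Overall weighted price index.
overall_index = weighted_sum / total_weight

print(weighted_sum, total_weight, round(overall_index, 1))
```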
Here is some real world data on a selection of price indices for goods and services.

| Year | All Items | Health | Transport | Communication | Tobacco | Clothing | New Cars | Second Hand Cars |
|---|---|---|---|---|---|---|---|---|
| 1996 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 |
| 2003 | 109.8 | 124.2 | 116.9 | 84.5 | 158.8 | 66.8 | 95.7 | 85.1 |

Over the period 1996-2003 there has been a rise of about 10% in the general price level. But this hides major changes in average prices for different products. The average cost of purchasing tobacco products has jumped by nearly sixty per cent, whereas the prices of clothing, second hand cars and communication have been falling.

Example 1: Calculate the Laspeyres’, Paasche’s and Fisher’s Index Numbers for prices in the year 2009, taking 2004 as the base year, from the following data.

| Commodity | Base year 2004 Price ($P_O$) | Base year 2004 Quantity ($Q_O$) | Current year 2009 Price ($P_1$) | Current year 2009 Quantity ($Q_1$) |
|---|---|---|---|---|
| Rice | 4 | 15 | 6 | 20 |
| Wheat | 3 | 40 | 5 | 35 |
| Jawar | 5 | 20 | 5 | 25 |
| Pulses | 6 | 10 | 8 | 10 |
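As a check on Example 1, the working below follows the formulas of section 5.4, taking the four rows of the table in order. The results are rounded to two decimal places.

$$
\begin{aligned}
\sum P_O Q_O &= 4(15) + 3(40) + 5(20) + 6(10) = 340 \\
\sum P_1 Q_O &= 6(15) + 5(40) + 5(20) + 8(10) = 470 \\
\sum P_O Q_1 &= 4(20) + 3(35) + 5(25) + 6(10) = 370 \\
\sum P_1 Q_1 &= 6(20) + 5(35) + 5(25) + 8(10) = 500 \\
P_L &= \frac{470}{340} \times 100 \approx 138.24 \\
P_P &= \frac{500}{370} \times 100 \approx 135.14 \\
P_F &= \sqrt{P_L \times P_P} \approx 136.68
\end{aligned}
$$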

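Example 2: Calculate Index Number –

| Commodity | Base year Price | Base year Quantity | Current year Price | Current year Quantity |
|---|---|---|---|---|
| A | 8 | 50 | 10 | 60 |
| B | 10 | 40 | 12 | 50 |
| C | 5 | 100 | 9 | 70 |
| D | 6 | 10 | 8 | 20 |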

5.9 Summary

An Index Number can be defined as a statistical tool used for measuring the relative changes in the value of a variable, or of a group of related variables, from one period or place to another.

Index Numbers are also called Economic Barometers. We use Index Numbers for measuring the relative changes in the price of a single commodity or a group of commodities.

5.7
1. Economic Barometers
2. Base period
3. Base year
4. Current year
5. $\frac{\sum P_N Q_O}{\sum P_O Q_O} \times 100$
6. $\frac{\sum Q_N P_N}{\sum Q_O P_N} \times 100$
7. Quantity of base year
8. Quantity of current year
9. PO
10. QN

5.8
1. Laspeyres’ Index Number = 138.24
2. Paasche’s Index Number = 135.14
3. Fisher’s Index Number = 136.68

Write short notes –

1) Index Number 2) Laspeyres’ Index Number
3) Advantages of Index Number 4) Consumer Price Index



QUESTION BANK

1. A private research organization studying families in various countries reported the following data for the amount of time 4 year old children spent alone with their fathers each day.

| Country | Time with father |
|---|---|
| India | 60 |
| Belgium | 30 |
| China | 54 |
| Finland | 50 |
| Germany | 36 |
| Nigeria | 42 |
| Sweden | 46 |
| United States | 42 |

For the above sample, determine the following measures:
a. The mean
b. The standard deviation
c. The variance
d. The mode
e. The 75th percentile

2. Thirty students in the School of Business were asked what their subjects were. The following represents their responses (M = Management; A = Accounting; E = Economics; O = Others).

A M M A M M E M O A
E E M A O E M A M A
M A O A M E E M A M

a. Construct a frequency distribution.
b. Construct a relative frequency distribution.


3. The frequency distribution below was constructed from data collected on the quarts of soft drinks consumed per week by 20 employees of a garden centre.

| Bottles of Soft Drink | Frequency |
|---|---|
| 0-3 | 4 |
| 4-7 | 5 |
| 8-11 | 6 |
| 12-15 | 3 |
| 16-19 | 2 |

a. Construct a relative frequency distribution.
b. Construct a cumulative frequency distribution.
c. Construct a cumulative relative frequency distribution.

4. In 2002, the average donation to the charity organization was Rs. 90,000 with a standard deviation of Rs. 180. In 2003, the average donation was Rs. 1,60,000 with a standard deviation of Rs. 240. In which year do the donations show a more dispersed distribution?

5. Profit after tax for a company for the last six years is as given below. Draw a bar diagram.

6. Explain the census and sample methods of data collection. Enumerate the advantages of the sample method over complete enumeration.

7. Distinguish between classification and tabulation of statistical data. What is their purpose?

8. For a sample of companies, the inventory to sales ratio expressed as a percentage is given below. Construct a histogram from the given data.

| Inventory to sales ratio (Percentage) | No. of Companies |
|---|---|
| 0 – 5.0 | 1 |
| 5.0 – 10.0 | 4 |
| 10.0 – 15.0 | 10 |
| 15.0 – 20.0 | 20 |
| 20.0 – 25.0 | 50 |
| 25.0 – 30.0 | 80 |
| 30.0 – 35.0 | 60 |
| 35.0 – 40.0 | 65 |
| 40.0 – 45.0 | 30 |

9. Find the median, the lower and upper quartiles, the 4th decile and the 70th percentile for the following distribution.

| Dividend Yield | 0-4 | 4-8 | 8-12 | 12-14 | 14-18 | 18-20 | 20-25 | 25 above |
|---|---|---|---|---|---|---|---|---|
| Number of Companies | 10 | 12 | 18 | 7 | 5 | 8 | 4 | 6 |

10. Cans of soft drinks cost Rs. 30 in a certain vending machine. What are the expected value and variance of the daily revenue (Y) from the machine, if X, the number of cans sold per day, has E(X) = 125 and Var(X) = 50?

11. A crop insurance company establishes the following loss table based upon previous claims.

| Percent loss | 0 | 25 | 50 | 100 |
|---|---|---|---|---|
| Probability | .90 | .05 | .02 | ???? |

If they write a policy that pays a maximum of Rs. 150/hectare, their expected loss in Rs/hectare is approximately.


12. A rock concert producer has scheduled an outdoor concert. If it is warm that day, she expects to make a Rs. 20,000 profit. If it is cool that day, she expects to make a Rs. 5000 profit. If it is very cold that day, she expects to suffer a Rs. 12,000 loss. Based upon historical records, the weather office has estimated the chances of a warm day to be 0.60 and the chances of a cool day to be 0.25. What is the producer’s expected profit?

13. Explain the concept of probability. Give an example.

14. If four coins are tossed once, write down the sample space.

15. If three units are tested, each unit will be either Good (G) or Defective (D). Write down the sample space for the testing of 3 units.

16. A box contains 200 bulbs, of which 20 are defective. If one bulb is selected at random, find the probability that it is non-defective. (Ans. 0.9)

17. Suppose a die is tossed 5 times. What is the probability of getting exactly 2 fours?

18. The probability that a student is accepted to a prestigious college is 0.3. If 5 students from the same school apply, what is the probability that at most 2 are accepted?

19. What is the probability that a series which ends when a team wins 4 games will last 4 games? 5 games? 6 games? 7 games? Assume that the teams are evenly matched.

20. An insurance company insured 1500 scooter drivers, 3500 car drivers and 5000 truck drivers. The probability of an accident is 0.05, 0.02 and 0.10 respectively in the case of scooter, car and truck drivers. One of the insured persons meets with an accident. What is the probability that he is a car driver?

21. In a certain university the percentages of Hindus, Muslims and Christians among students are 50, 25 and 25 respectively. If 50% of Hindus, 90% of Muslims and 80% of Christians are smokers, find the probability that a randomly selected student is a Muslim. Use Bayes’ theorem.

22. A company has four production sections S1, S2, S3 and S4 which contribute 30%, 20%, 28% and 22% respectively to the total output. It was observed that these sections produced 1%, 2%, 3% and 4% defective units respectively. If a unit is selected at random and found to be defective, what is the probability that it is from S1 or S4?

Mean 10 90

S. D. 3 12

Correlation Coefficient = 0.8

a. Calculate the two regression lines.
b. Find the likely sales when advertising expenditure is Rs. 15 lakhs.
c. What should be the advertising expenditure if the company has a sales target of Rs. 120 lakhs?

24. Obtain the two lines of regression for the following data.

X 43 44 46 40 44 42 45 42 38 40 472 57

Y 29 31 19 18 19 27 27 29 41 30 26 10


25. Calculate Karl Pearson’s Coefficient of Correlation for the data given below, taking 66 and 63 as the assumed means of x and y respectively.

Height of Husband x (in inches) 60 62 64 66 68 70 72

26. Calculate the coefficient of correlation and the probable error from the following data.

X 1 2 3 4 5 6 7 8 9 10

Y 20 16 14 10 10 9 8 7 6 5

27. Write short notes on
a. Coefficient of determination
b. Rank correlation.

