STATISTICS
BY
D. N. ELHANCE, M. COM.,
Head of the Department and Dean of the Faculty of Commerce,
University of Jodhpur,
Jodhpur.
Printed by: Eagle Offset Printers, 15, Thornhill Road, Allahabad
Published by: Kitab Mahal, 15, Thornhill Road, Allahabad
IN MEMORY
OF
MY FATHER
PREFACE TO THE THIRTEENTH EDITION
A new edition of the now famous book on Statistics has come out
maintaining its old traditions intact but with new approaches all round
to register and record the various changing aspects.
Calculations have been rechecked in order to eliminate any
slight variations which may have crept in during the past years.
The change to the metric system has also been completed.
In its present form the utility of the book has increased considerably,
and University students as well as administrators will find sufficient
material for their guidance and assistance.
The author will feel grateful to the discriminating student community
and the general users of the book for their indulgence in pinpointing
any error.
D. N. ELHANCE
PREFACE TO THE SIXTH EDITION
The present edition of this book has many new features. Two
new chapters, Designs of Experiments and Statistical Quality Control,
have been added in this volume. The chapter on Growth of Statistics
in India has been brought up to date and the latest figures have been
substituted for old ones.
Some chapters of this book have been revised and new points have
been included in them. A large number of fresh questions have been
added at the end of each chapter to make the book more useful to
examinees. The entire portion on Indian Statistics has been brought
up to date.
I hope the present volume will be found useful by the students
of the subject. I am grateful to a number of students and friends who
gave me valuable suggestions for the improvement of the book and
I am confident that they will continue to do so in future also.
D. N. ELHANCE
PREFACE TO THE SECOND EDITION
From the various reviews which appeared in a large number of
journals and papers, I conclude that the first edition of this book was
very well received. In the present edition I have rearranged certain
chapters and brought the chapter on Growth of Statistics in India up to
date. Besides, I have included a large number of new questions at the
end of each chapter.
The book is now divided into two volumes. Volume I covers the
entire B.Com., B.A. and B.Sc. course of statistics of all the universities
of India and Pakistan. Volume II contains chapters on Probability,
Theoretical Frequency Distributions and Sampling. The two volumes
are available separately as well as in a combined form.
I am grateful to a large number of friends who have given me
valuable suggestions for the improvement of the book. I hope the
students of the subject will find the book more useful than before.
15th April, 1958. D. N. ELHANCE
PREFACE TO THE FIRST EDITION
The science of statistics has assumed great importance in recent
years. It was once known as the "Science of Kings" and its scope
was extremely limited, but today the science of statistics has become
an all-important science, without which no other science can progress.
The modern age is the age of statistics and it is very correctly said that the
extent of the economic development of a country can best be known
by finding out the extent to which statistical organisation has developed
there. Till recently the foreign government of our country and even
our countrymen were very indifferent towards statistics. After the
independence of the country the era of economic planning started and along
with it the importance of statistics increased considerably. In fact
economic planning cannot be imagined in the absence of statistical
data.
It is a matter of great satisfaction that the importance of statistics
is gradually being realised in our country and that statistics are occupying the
place of honour which they should have got much earlier. Statistics
is now taught in almost all the universities of the country and there are
a number of statistical institutes which impart special training in this
subject. This book is an attempt to furnish a simple, non-mathematical
text for those who desire to equip themselves with a knowledge of the
elementary statistical methods used in modern times. The treatment
of the subject has been, as far as possible, of a non-mathematical character
because most of the students who study this subject do not have
a mathematical background. This book has been written primarily
for the use of M.A., M.Com. and B.Com. students who study this subject.
The book covers the entire course which is prescribed for the statistics
paper in these examinations in various universities of the country as
also the courses prescribed for the paper in the I.A.S. and P.C.S.
examinations. A large number of questions have been given at the end of
each chapter with a view to helping the students in solving numerical
problems and thus familiarising themselves with the different types of formulae
used in statistical analysis.
I am grateful to my colleagues in the Faculty of Commerce,
Allahabad University, who have given me some very valuable suggestions.
Thanks are also due to Mr. S. V. Erasmus, my secretary, who worked
almost like a machine for all the days during which this book was
written. Kitab Mahal, my publishers, deserve congratulations for the
nice printing and get-up of the book.
1st December, 1956. D. N. ELHANCE
CONTENTS
CHAPTER
1. Meaning and Definition of Statistics
2. Origin and Growth of Statistics
3. Importance, Limitations and Functions of Statistics
4. Preliminaries to the Collection of Data
5. Collection of Primary and Secondary Data
6. Accuracy, Approximation and Errors
7. Classification, Seriation and Tabulation
8. Ratios, Percentages and Logarithms
9. Measures of Central Tendency
10. Measures of Dispersion
11. Moments, Skewness and Kurtosis
12. Index Numbers
13. Diagrammatic Representation of Data
14. Graphic Representation of Data
15. Analysis of Time Series
16. Correlation
17. Regression and Ratio of Variation
18. Theory of Attributes and Consistence of Data
19. Association of Attributes
20. Interpolation
21. Business Forecasting
22. Interpretation of Data
23. Probability
24. Theoretical Frequency Distributions
25. Theory of Sampling
26. Sampling of Attributes
27. Sampling of Variables (Large Samples)
28. Chi-square Test and Goodness of Fit
29. Sampling of Variables (Small Samples)
30. Analysis of Variance
31. Designs of Experiments
32. Statistical Quality Control
33. Growth of Statistics in India
34. Mathematical Tables
DETAILED CONTENTS
Statistical methods
The second sense in which the word statistics is used refers to
the statistical principles and methods used in collection, analysis and
interpretation of data. In this sense the word is used in singular.
Statistical methods (or statistics) have a very wide range. They include
not only simple and commonly known devices of comparison and
analysis, but also highly technical and mathematical formulae which
are capable of being understood only by experts who have received
special training in this subject.
Statistical methods and experimental methods. Statistical methods
include all those devices which are used in the collection and simplification
of numerical data so as to render them capable of being analysed and
commonly understood without much difficulty. Statistical methods are
different from experimental methods inasmuch as the latter are more
accurate and precise than the former. In experimental methods it is
possible for us to study the effects of any one of the many factors affecting
a phenomenon individually by making the other factors inoperative
for the time being. Thus in physics it is not difficult to study the effects
of, say, only heat on the density of air by making other factors
inoperative for the duration of study. But the same thing is not possible
in statistical methods. It is not feasible to study the effects of, say,
only inflation on prices. The effects of inflation cannot be studied
separately from the effects of many other factors like demand, supply,
exports and imports, etc. However, by the use of statistical methods
it is possible to have a rough idea of the effects of inflation upon prices.
Statistical study cannot be as accurate as the study done by experimental
methods. Thus we see that statistical methods are comparatively less
accurate and are usually applied in inexact sciences like sociology, though
even in physical sciences (which are classed as exact sciences) the use
of these methods is sometimes necessary. Statistical methods are thus
of universal application though their primary field is the social sciences.
Thus "Statistics are numerical facts, but statistics is a body of
methods for making decisions when there is uncertainty arising from
the incompleteness or the unstability of the information available. The
decisions may be made either for the practical purpose of selecting
a course of action or for the scientific purpose of gaining general
knowledge."
DEFINITION
The term statistics has been defined differently by different authors.
Some authors have defined the word as used in the first sense
(of numerical data) while others have defined it as used in the second
sense (of statistical methods or the science of statistics).
First Type
Of the first type of definitions the one given by Horace Secrist is
the most exhaustive. It is as follows:
"By statistics we mean aggregates of facts affected to a marked
extent by multiplicity of causes, numerically expressed, enumerated
or estimated according to reasonable standards of
accuracy, collected in a systematic manner for a predetermined
purpose and placed in relation to each other."
This definition makes it clear that statistics (as numerical data)
should possess the following characteristics:
(i) They should be aggregates of facts. Single and unconnected
figures are not statistics. A single age of 25 years or 40 years is not
statistics but a series relating to the ages of a group of persons would
be called statistics. A single figure relating to birth, death, purchase,
sale, accident, etc., does not form statistics though aggregates of figures
relating to births, deaths, purchases, sales, accidents, etc., would be
called statistics because they can be studied in relation to each other and
are capable of comparison. It is possible to study them in relation to
time, place and frequency of occurrence.
(ii) They should be affected to a marked extent by multiplicity of causes.
Usually statistical facts are not traceable to a single cause. Since statistics
are most commonly used in social sciences it is only natural that
they are affected by a large variety of factors at the same time. It is
usually not possible to study the effects of any one of these factors
separately as is the case in experimental methods. In statistical methods
the effects of various factors affecting a particular phenomenon are
generally studied in a combined form though attempts are also made
to study the effects of different sets of factors separately as well. Most
of the statistics, however, are affected to a considerable degree by multiple
causation. For example, statistics of prices are affected by conditions
of supply, demand, exports, imports, currency circulation and
a large number of other factors.
(iii) They should be numerically expressed. Qualitative expressions
like good, bad, young, old, etc., do not form part of statistical studies
unless a numerical equivalent is assigned to each such expression. If
it is said that the production of wheat per acre in 1953 was 100 maunds
and in the year 1954 it was only 60 maunds or if it is said that of two
persons A and B, A is 20 years old and B 60 years old, we shall be making
statistical statements.
(iv) They should be enumerated or estimated according to reasonable
standards of accuracy. Numerical statements can either be enumerated,
in which case they are supposed to be accurate and precise, or they can
be estimated by some expert observers. Where the scope of statistical
enquiry is very wide or where the numbers are very large, enumeration
is usually out of the question and in such cases figures can only be estimated.
It is obvious that estimated figures cannot be absolutely accurate and
precise. The degree of accuracy expected in such figures depends to
a large extent on the purpose for which statistics are collected and also
FUNDAMENTALS OF STATISTICS
on the nature of the particular problem about which data are being
collected. There cannot be a uniform standard of accuracy for all
types of enquiries. For example, if the heights of a group of individuals
are being measured it is all right if the measurements are correct to the
tenth of an inch but if we are measuring the distance from Bombay
to Calcutta, a difference of even a few furlongs can be easily ignored.
(v) They should be collected in a systematic manner. If figures are
collected in a haphazard fashion one can never be sure about the degree
of accuracy of such data. It is, therefore, essential that statistics must
be collected in a systematic manner so that they may conform to
reasonable standards of accuracy.
(vi) They should be collected for a predetermined purpose. It is obvious
that if statistical data are not collected with some predetermined aim
their usefulness would be almost negligible. Figures are usually collected
with some end in view, as without it all the efforts made in the collection
of figures would be completely wasted and the figures so collected
would not be in any way useful.
(vii) They should be placed in relation to each other. Statistics are
collected mostly for the purpose of comparison. If the collected figures
are not capable of being compared with each other they lose a very
large part of their value. It is necessary that the figures which are
collected should be a homogeneous lot because it is not possible to
compare figures which are of a heterogeneous character and which
cannot be placed in relationship to each other. If, for example, the
height of a person and the money spent by him in getting his house
constructed are placed together it does not make any sense and the figures
cannot be compared with each other. Such figures naturally do not come
under the category of statistics.
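The point made under characteristic (iv), that the same size of error may matter in one enquiry and be negligible in another, can be illustrated by comparing each error with the quantity being measured. The sketch below is only an illustration: the height of 66 inches, the distance of 1,000 miles and the error of 3 furlongs are assumed figures and do not come from the text, and the function name is hypothetical.

```python
# Relative error: the size of an error compared with the quantity measured.
# All figures below are assumed for illustration; none come from the text.

FURLONGS_PER_MILE = 8  # 1 mile = 8 furlongs


def relative_error(error, true_value):
    """Return the error as a fraction of the quantity being measured."""
    return abs(error) / abs(true_value)


# A height of 66 inches (assumed) measured correct to a tenth of an inch.
height_error = relative_error(0.1, 66.0)

# A distance of 1,000 miles (assumed) that is off by 3 furlongs.
distance_error = relative_error(3 / FURLONGS_PER_MILE, 1000.0)

print(f"height:   {height_error:.4%}")
print(f"distance: {distance_error:.4%}")
```

Under these assumed figures the error of a few furlongs is a smaller fraction of the distance than a tenth of an inch is of the height, which is why it can be ignored in the one enquiry while the finer standard is kept in the other.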
Webster has also defined statistics in the same sense in which Secrist
has defined it. Webster's definition of statistics is as follows:
"Statistics are the classified facts respecting the conditions of
the people in a state ... specially those facts which can be stated
in numbers or in tables of numbers or in any tabular or classified
arrangement."
This definition is rather narrow. It confines statistics only to
those facts which relate to the condition of the people in a state. This
was a very old concept of the word statistics and it does not suit modern
conditions. At present statistics relate to all aspects of human activity
and as such this definition falls short of the modern concept of the term.
Moreover, this definition is not as clear and exhaustive as the one given
by Secrist.
Second Type
Of the second type of definitions of the term statistics (as statistical
methods or science of statistics) the one given by Seligman is very
short and simple and yet quite comprehensive. According to Seligman,
"Statistics is the science which deals with the methods of collecting,
classifying, presenting, comparing and interpreting numerical data
collected to throw some light on any sphere of enquiry."
According to King, "the science of statistics is the method of judging
collective, natural or social phenomenon from the results obtained
from the analysis or enumeration or collection of estimates." This
definition is not very exhaustive and it limits the scope of the science
of statistics. The author himself admits this defect but is of the view
that for practical purposes it is all right.
A. L. Bowley has given a series of definitions but most of the
definitions given by him are not complete and lay emphasis only on
some of the aspects of the science. At one place Bowley says, "Statistics
may be called the science of counting." At another place he is of
the view that "Statistics may rightly be called the science of averages".
Both these definitions are defective as the science of statistics does not
confine itself either to counting or to averaging alone. These are no
doubt important statistical methods but they do not cover the entire
field of the science of statistics. Yet another definition given by the
same author characterises statistics as "the science of measurement of
the social organism regarded as a whole in all its manifestations." Obviously
this definition limits the application of the statistical methods
to only one field, namely, sociology. Bowley realised this limitation
and he himself writes at another place that statistics cannot be confined
to any one science.
Boddington has defined statistics as the science of "estimates and
probabilities." This definition gives expression only to certain methods
by which conclusions are derived in this science. No doubt in most
of the cases statistics are estimates and 'probabilities' but it should be
remembered that the scope of the science is not confined merely to
these things.
Lovitt defines the science as "that which deals with the collection,
classification and tabulation of numerical facts as the basis for explanation,
description and comparison of phenomena." This definition is
fairly satisfactory and it indicates that the science of statistics is a simple
and scientific exposition of statistical methods.
Having briefly discussed some of the definitions of the term statistics
and having seen their drawbacks we are now in a position to give
a simple and complete definition of the term in the following words:
Statistics (as used in the sense of data) are numerical statements of fact
capable of analysis and interpretation and the science of statistics is a study of
the principles and methods used in the collection, presentation, analysis and
interpretation of numerical data in any sphere of enquiry.
MAIN DIVISIONS OF THE STUDY OF STATISTICS
Statistics as a science can be divided into two main classes, namely,
statistical methods and applied statistics.
1. Statistical methods
Under statistical methods are studied all those devices, rules of
procedure and general principles which are applicable to all kinds or
groups of data. Thus they include all the general principles and techniques
which are commonly used in the collection, analysis and interpretation
of data relating to any sphere of enquiry. Statistical methods
are the tools in the hands of a statistical investigator. These are devices
for achieving the desired ends explained in theory. Since a method is
always a means to an end, its accuracy and precision depend on the
object which is desired to be achieved and this in turn is considerably
affected by the peculiar features of the problem to which it is related.
This is the reason why different statistical methods are used in different
types of enquiries and no uniform standard of accuracy is desired to
be achieved in different types of investigations.
2. Applied statistics
Applied statistics deal with the application of statistical methods
to specific problems or concrete forms. If we have to estimate the
national income of a country or its industrial or agricultural production
then the special techniques followed to achieve these ends and the results
obtained therefrom would form part of applied statistics. As is
clear from the above explanation, applied statistics can be further divided
into two main groups. They may be either descriptive or scientific.
Descriptive applied statistics deal with data which are known and
which naturally relate either to the present or to the past. For example,
business statistics are descriptive applied statistics, as they deal with
the analysis, measurement and presentation of business facts relating
to past or present. On the basis of these facts decisions about various
business problems are usually taken.
Scientific applied statistics deal with the formulation of physical
and psychological laws on the basis of quantitative data collected for
descriptive purposes by the use of appropriate statistical methods. If,
for example, by the use of some business statistics we are in a position
to derive certain conclusions, which we use for forecasting the future
trend or tendency of that particular phenomenon, we are making use
of scientific applied statistics. For purposes of business forecasting
we have to make use of such statistics.
OBJECTS OF STATISTICS
Questions
1. Explain clearly the concepts of statistics, statistical methods and statistical
sciences.
2. Examine the main differences between statistical methods and experimental
methods.
3. Critically examine the following definitions of statistics: "Statistics is a
science of counting", "Statistics is a science of averages", and "Statistics is a science
of the measurement of social organism in all its aspects". (B. Com. Agra, 1943).
4. Discuss the meaning and scope of statistics.
5. "Statistics affects everybody and touches life at many points. It is both
a science and an art." Explain the above statement with appropriate examples.
(B. Com. Agra, 1946).
6. "Statistics of a business can be treated scientifically and the preparation
and study of business statistics may be made a more exact science than the study of
national and social statistics". Explain. (B. Com. Allahabad, 1932).
7. "Science without statistics bears no fruit, statistics without science have
no root." Explain the above statement with necessary comments.
(M. A. Patna, 1943).
8. "Statistics is cooperative counting." Comment.
9. What are the characteristics that statistics (statistical data) possess? Explain
with illustrations.
10. What are the main divisions of statistics? Illustrate with examples.
11. Write a note on the objects of statistics.
12. "Statistical methods include all those devices of analysis and synthesis by
means of which statistics are scientifically collected and used to explain or describe
phenomena either in their individual or related capacities". Comment on the above
statement.
13. Explain with illustrations how statistical methods tend to clarity of thought,
accuracy of estimates, verification of theories and discovery of relations.
(B. Com. Agra, 1947).
14. "By statistics we mean quantitative data affected to a marked extent by a
multiplicity of causes". Explain. (M. Com. Agra, 1945).
15. In what ways can statistical methods be misused by interested persons?
Give at least two examples of the misuse of statistics.
16. "A statistician is not an alchemist expected to produce gold from any worthless
material." Comment.
2. Origin and Growth of Statistics
Early Beginnings
The origin of statistics is suggested by the derivation of this word.
It seems to have been derived from the Latin word status which means
a political state. In fact the origin of statistics was due to administrative
requirements of the state. Statistics in the past were a by-product of
administrative activity. The administration of the states required the
collection and analysis of data relating to population and material wealth
of the country for purposes of war and finance. The earliest form of
statistical data, therefore, relates to census of population and property.
Collection of data for other purposes, however, was not entirely ruled
out. Perhaps one of the earliest censuses of population and wealth
was held in Egypt as early as 3050 B.C. for the erection of pyramids.
Rameses II conducted a census of all lands of Egypt. During the Middle
Ages such censuses were held in England, Germany and other Western
countries as well. In India about 2000 years ago we had an efficient
system of collecting administrative statistics. During the Hindu period,
particularly during the Mauryan regime, our country had an efficient
system of collecting vital statistics and of the registration of births and
deaths. Ain-i-Akbari gives us a detailed account of the administrative
and statistical survey conducted during the reign of Emperor Akbar.
The histories of the other countries of the world also clearly indicate
that in ancient times statistics was regarded as a matter connected with
the activities of the state and that is why it was known as a science of
statecraft. The systematic collection of official statistics originated in
Germany towards the end of the eighteenth century. In its earliest
form it was an attempt to assess, for political purposes, the relative
strengths of the German states by comparing population, industrial and
agricultural output. In England, statistics is a legacy of the Napoleonic
Wars. In order to raise new taxes that the cost of the war demanded,
it was found necessary to collect such facts and figures which would
enable government to have an idea about the probable revenue and
expenditure more accurately.
Sixteenth Century
These spasmodic attempts made in ancient times to collect certain
facts and figures can be left out of account as in those days statistical
methods were not properly developed and we do not know the technique
by which these figures were collected. Most of these figures
are not available and all that we know is that such statistics were collected
in those days. It has been only within comparatively recent times that
mankind has realised the usefulness of collecting statistics
relating to the phenomena of the physical and social universe. Prior to it,
the astronomers used to record the movements of heavenly bodies like
stars and planets to foretell their position and to make forecasts about
eclipses. Tycho Brahe (1546-1601) collected valuable information about
the movements of planets and Johannes Kepler made an exhaustive study
of these data and discovered the three famous laws relating to the movement
of planets. It was on the basis of these laws that Sir Isaac Newton
formulated his theory of gravitation. Sir Francis Bacon (1561-1626)
was of the opinion that a proper knowledge of nature can be obtained
only on the basis of the study of data relating to various forms of nature,
and under his influence this method was adopted by scholars in various
fields. When these methods proved their efficacy in physical sciences
and when it was found that the results obtained by the use of these devices
were very accurate, social sciences like politics, economics and sociology
all adopted statistical methods for the formulation of their theories
and for testing the degree of accuracy achieved by them.
Seventeenth Century
Eighteenth Century
The modern theory of statistics can be said to have been formulated
by L. A. J. Quetelet (1796-1874). He put forward the notion of the 'average
man' whose actions, he stated, conform to the 'average results obtained
from society.' He was further of the opinion that the action and behaviour
of other persons deviated from this norm in a lesser or greater
degree and these deviations from this theoretical average were capable
of being treated by the method of errors and probability. He also
emphasised the importance of the 'law of large numbers' which was
founded by Jacob Bernoulli.
In fact the science of statistics is highly indebted to the games of
chance. G. Cardano (1501-1576), who was a great mathematician and
at the same time a big gambler also, wrote a valuable treatise on the
hazards of games of chance and he pronounced certain rules by which
the risks of gambling could be minimized and one could protect himself
against cheating. These rules were based on the correct approach
to the problems which we, in modern times, study under the theory
of probability. Jacob Bernoulli and his nephew Daniel Bernoulli (1700-1782)
laid a solid foundation of the theory of probability and put forward
the idea of 'moral expectation'. It was after this that Pierre Simon de
Laplace (1749-1827) published in 1782 his monumental work on the
theory of probability. This work is recognised as one of the best ever
done on the subject of probability. It is both mathematical and
philosophical. Later on most of the prominent mathematicians of
the eighteenth and nineteenth centuries like Moivre, Fiuier, Lagrange,
Chrystal, Bayes, Todhunter, Gauss, Morgan, Lexis and Charlier, to mention
only a few names, contributed to the subject of probability.
Nineteenth and Twentieth Centuries
On these foundations laid by the mathematicians of the eighteenth
and nineteenth centuries the modern theory of statistics was gradually built
up. G. F. Knapp (1842-1926) and W. Lexis (1837-1914) contributed
valuable works on the statistics of mortality. Sir Francis Galton (1822-1911)
was the first to introduce statistical methods in the field of biometry.
Later on Karl Pearson took up this chain and his work on the
subject is too well known to need any detailed description. In the words
of Pearson himself, "the whole problem of evolution is a problem of
vital statistics, a problem of longevity, of fertility, of health, of disease
and it is impossible for the evolutionist to proceed without statistics as it
would be for the Registrar General to discuss the National Mortality
without an enumeration of the population, a classification of deaths and
a knowledge of statistical theory."
It was in the second half of the last century and in the present
century that statistical methods entered the realm of the science of economics
and became intimately associated with the ancient subject of
mathematics. Though the relationship of statistics and mathematics is
very old yet it is only during the last 100 years or so that the two sciences
have come very close to each other. In recent years the domain of
statistical methods has considerably widened and today there is hardly
any science which does not make use of statistical methods. The science
of statistics is now associated with all other sciences in some form or
the other and we shall now study the relationship of statistics with other
sciences, particularly with economics and mathematics. For the past
two decades particularly there has been a remarkable and sustained
growth in the use of statistics. This is because business, government
and science, three fields in which applications of statistics are most numerous
and diverse, are growing in volume and complexity. It is
also because of the technological revolution which has taken place in
data handling, affecting especially computing and tabulating equipment,
and a scientific revolution in statistical theories and techniques.
RELATIONSHIP OF STATISTICS WITH OTHER SCIENCES
Statistics and Economics
Though the relationship of statistics with economics dates back to
1690, when Sir William Petty published his book "Political Arithmetic",
the relationship of these two sciences became intimate rather
late. No doubt statistical data about economic problems used to be
collected in the past, but there was no relationship between statistics and
economic theory. In the earlier stages of its development the science of
economics was based on deduction, and the predominance of the deductive
approach was responsible for the indifference of economists towards
quantitative data for purposes of the development of economic
doctrines. Besides this, there was also a tendency in those days to avoid
figures, which were considered to be lifeless, rude and coarse. What
was responsible for this peculiar attitude to figures in those days is
difficult to state. It is a fact that people wanted to avoid the rude shocks
which awaited them in the world of facts and always wanted to be vague
in their statements and logic. Gradually this hatred for figures melted
away and even deductive writers like J. S. Mill admitted that "in some
cases instead of deducing our conclusions from reasoning and verifying
them from observations we begin by obtaining them provisionally from
specific experience and afterwards connect them with the principles of
human nature by a priori reasoning." Similarly in 1871 W. S. Jevons
wrote that "the deductive science of economy must be verified and
rendered useful from the purely inductive science of statistics. Theory
must be invested with the reality of life and fact. Political economy
might gradually be erected into the exact science, if only commercial
statistics were far more complete and accurate than they are at present
so that the formulae could be endowed with the exact meaning by the
aid of numerical data." Jevons developed the technique of analysis of
time series and was the pioneer in the field of price studies and index
numbers. Rightly he has been called the 'Father of Index Numbers'.
Besides Jevons, the Historical School (1843-1883) also brought statistics
and economics close to each other. In fact Roscher, Knies and Hildebrand
were all of the opinion that economic doctrines should not be
argued in the abstract and that they should be inductively verified. The
12 FUNDAMENTALS OF STATISTICS
effect of the preachings of the Historical School was indeed very great and
the science of economics no longer remained merely deductive in its
approach. By the time the present century began, much of the opposition
to the use of statistical methods in the realm of economics had
ended, and in 1907 Alfred Marshall could write, "Disputes as to methods
have ceased. Qualitative analysis has done the greater part of its work...
that is to say, there is general agreement as to the characteristic and
duration of the changes which various economic forces tend to produce.
Much less progress has been made towards the quantitative determination
of the relative strength of different economic forces... that higher
and more difficult tasks must wait upon the slow growth of thorough
realistic statistics." At the same time Pareto wrote, "The progress of
political economy in the future will depend in great part upon the
investigations of empirical laws derived from statistics which will then be
compared with known theoretical laws or will suggest derivation from
them of new laws." Later on Lord Keynes, writing about the functions of
statistics, wrote that it is "first, to suggest empirical laws, which may or may
not be capable of subsequent deductive explanations; and, secondly, to
supplement deductive reasoning by checking its results and submitting
them to the test of experience." Now there are no two opinions about
the fact that both induction and deduction are necessary for the growth
and development of economic science. In fact statistics and economics
are so intermixed with each other now that the question of their separation
does not arise.
Factors responsible for closer ties between economics and statistics. Since
1890 two factors have worked together to bring about this great change in
the relationship of statistics and economics. The first is the development
of statistical methods of probability and sampling, simple and
partial correlation and association, periodicity and index numbers, etc.;
the second is the enlargement of statistical material in recent years. In
fact during this period various eminent statisticians like C. B. Davenport,
A. L. Bowley, W. Pearson, W. I. King and R. A. Fisher have made very
valuable contributions towards the development of the science of statistics.
During this period statistical data have also increased in
quantity all over the world on account of the establishment of statistical
bureaus in various countries. The improvement of statistical methods
and the expansion of statistical data have thus brought economics and
statistics very close to each other and have marked the real inception of
statistics in the domain of the science of economics.
Statistics, economics and mathematics
It has already been mentioned above that statistics and mathematics
have been closely in touch with each other ever since the seventeenth
century, when the theory of probability was found to have a bearing on
various statistical methods. During the last 100 years or so not only
have statistics and mathematics come very close to each other due to the
development of mathematical statistics, but these sciences have been
joined by economics as well, and now there is a happy union between
statistics, economics and mathematics. Mathematics has considerably
study of the significance of these deviations has also to be made for various
purposes. All this cannot be done without the use of statistical methods.
We thus find that the science of statistics helps meteorology in a large
number of ways.
The above account of the origin and growth of statistics clearly
reveals the fact that the great science of statistics is associated with all
the other important sciences, both physical as well as social. In fact
today the domain of statistics is very wide, it is almost universal and
it is difficult to imagine any science worth the name where statistics has
not proved its usefulness in some form or the other. Bowley was right
when he said, "A knowledge of statistics is like a knowledge of a foreign
language or of algebra; it may prove of use at any time under any
circumstances."
Causes of the recent growth of Statistics. The tremendous growth in
the use of statistics, as has been shown above, can be attributed mainly to
two factors, viz., increased demand for statistics and decreasing cost of
statistics.
(1) Increased demand. There has been a phenomenal increase in the
demand for statistics in various fields. Statistics are most commonly
used by businessmen, government and scientists. The spheres of
activity of all these three categories have increased extraordinarily in
modern times. The magnitude of business has considerably increased,
resulting in an increased demand for statistics. Business in modern
times has become a very complicated affair and this fact has further
augmented the demand for statistical data. The complexity in business is
on account of numerous government regulations, labour disputes,
ever-increasing taxes and the technological revolution which the business world
has witnessed in recent years.
Even more than business activities, the activities of the government
have increased both in size as well as in complexity. Modern states are
welfare states and they have to look after a large variety of things, resulting
in an increased demand for statistics.
Probably the most spectacular development of the modern world is the
growth of scientific research. Science today is a very complex phenomenon
and different types of research in the field of science are of an
extremely complex nature and make extensive use of statistical
data. We thus find that the demand for statistics has considerably
increased and this is one reason why the science of statistics is developing
so fast.
(2) Decreasing cost. Another reason why the science of statistics
has developed so fast and has become so popular is that, on account of
a number of reasons, the cost and the time required for the collection and
analysis of data have gone down. There has been a vast improvement in
the technique of processing data which has resulted in great economy
of both time and cost. Modern computing and tabulating machines
save not only time but money also. The development of electronic
calculators and other modern machines like desk calculators and
card-sorting machines, etc., has made the task of scientists, businessmen and
administrators very easy and simple. They have resulted in a very great
economy both in terms of money as well as the time needed to do a job.
Statistical theory has also developed in modern times in such a
manner that the cost of compilation of statistical data has gone down
considerably. The theory of sampling, various designs of experiments
and statistical quality control have all contributed towards lowering
the cost of collection and analysis of statistical data.
Questions
1. Write a short essay on the origin and growth of the science of statistics and
throw light on its future.
2. Explain the relationship between economics and statistics. How far has
the use of statistical methods in economics led to its development?
(B. Com. Lucknow, 1941)
3. "Statistics are the straw out of which every other economist has to
make the bricks." (Marshall).
Explain, in the light of the above observation, the relation between economics
and statistics and discuss how far it is correct to say that the science of economics is
becoming statistical in its method. (M. Com. Allahabad, 1944).
4. Trace the association of mathematics with the science of statistics and show
how the former has considerably helped the development of the latter.
5. Discuss the relationship between statistics and various social sciences.
6. Do you think that statistical methods are of any help in physical sciences?
If so, how?
7. Write a brief essay on the relationship of economics, statistics and
mathematics.
8. Show how the science of statistics, which was originally the science of
statecraft, has now become the science of universal application. Do you think that statistical
methods are in reality applicable to all types of sciences?
9. How far has the growth of statistics coincided with the development of
physical and social sciences?
10. "Statistics is an apparatus by the help of which the validity of the laws of
physical and social sciences can be tested". Comment.
11. Discuss the factors responsible for the quick development of statistics in
recent years.
Importance, Limitations and
Functions of Statistics 3
Statistics and the common man. The fact that in the modern world
statistical methods are of universal applicability is in itself enough to
show how important the science of statistics is. As a matter of fact
there are millions of people all over the world who have not heard a
word about statistics and yet who make profuse use of statistical methods
in their day-to-day decisions. Statistical methods are common
ways of thinking and hence are used by all types of persons. When
a person wishes to purchase a car or a radio and goes through the
price lists of various companies and makers to arrive at a decision, what
he really aims at is to have an idea about the average level and the range
within which the prices vary, though he may not know a word about
these terms. When a farmer wishes to have a particular quantity of
rain in a particular season so that he may have a good crop, he has in fact
an idea of the correlation that exists between rainfall and crop yields and
the regression line of crop yields on rainfall. Again, when we use the
common proverb "as you sow, so you reap" we indirectly hint that there
is a positive correlation between one's actions and achievements.
Examples can be multiplied to show that human behaviour and
statistical methods have much in common. In fact statistical methods
are so closely connected with human actions and behaviour that practically
all human activity can be explained by statistical methods. This shows
how important and universal statistics is.
CAUSES OF THE IMPORTANCE OF STATISTICS
Simplifies complexity. One reason why statistics is so important
today is that it simplifies complexity. The human mind is not capable of
assimilating huge masses of facts and figures, and statistical methods, by making
these data easily intelligible and readily understandable, render a great
service, because in their absence the information would not have been
of any use. Statistical methods describe a phenomenon in a very simple
fashion. If, for example, we have to study the economic system of
Soviet Russia we cannot properly understand it by a purely descriptive
method in which no statistics are used, but if the different aspects of the
economic system are numerically expressed we can understand the whole
thing in a short time and in a better manner.
Measures results. Similarly if we have to measure the results of
a particular policy it can best be done by statistical methods. If we have
to study, for example, the effect of a rise in the bank rate on the industries
of a country we can do so in a proper manner only by means of a statistical
They tell us about the volume of business done in a country and the
amount of money in circulation. Distribution statistics disclose the
economic conditions of the various classes of people. They throw light
on the distribution of the national dividend amongst the inhabitants of a
country. We thus find that in all types of economic problems the statistical
approach is essential and statistical analysis useful. "Mathematics and
its offsprings, statistics and accounting, are the powerful instruments
which the modern economist has at his disposal, and of which business,
through the development of research agencies and methods, is making
constantly greater use."
Need in planning. The modern age is an age of planning. The days
of laissez faire are gone and state intervention in practically all aspects
of life has become universal in character. Today we live in a period of
transition; economic activities are being more and more closely directed
to the production of such goods, and the provision of such services, as
the government may decide to be most urgently required. Our future
is very largely being planned, and this planning, to be successful, must be
soundly based on the correct analysis of complex statistical data. Whenever
we think of a plan we have to think of statistics. Planning cannot
be imagined without statistics. If we study the economic plans implemented
in various countries in recent times we will find that all of them
are a statistical study of the economic resources of the respective countries,
and they suggest possible ways and means of utilising these resources
in the best possible manner. The various plans that have been prepared
for the economic development of India have also made use of the statistical
material available about various economic problems. The fact that
in our country the amount of statistical material available to the planners
has been very scanty is responsible for many drawbacks and inaccuracies
in different plans. Not only are plans of economic development constructed
on the basis of statistical data, but the success that a plan achieves is
also measured best by the use of statistical apparatus. We thus find that
in the field of economic planning the use of statistics is indispensable.
Usefulness in commerce. Statistics are an aid to business and commerce.
In fact today the situation is that a businessman succeeds or fails
according as his forecasts prove to be accurate or otherwise. When a
man enters business he enters the profession of forecasting, because success
in business is always the result of precision in forecasting and failure in
business is very often due to wrong expectations, which arise in turn from
faulty reasoning and inaccurate analysis of the various causes affecting a
particular phenomenon. Modern devices have made business forecasting
more definite and precise. Economic barometers are the gifts
of statistical methods and businessmen all over the world make extensive
use of them. A producer estimates the probable demand for his goods,
analyses the effects of trade cycles and seasonal variations, as also of changes
in the habits and customs of people, on the demand for his wares, and after
taking all these factors into consideration finally takes a decision about the
quantum of production. A businessman who ignores the effects of booms
and depressions can never succeed and is bound to face frustrations as his
was generally believed that the Indian people wanted to resume fighting
again. A poll of public opinion carried out by a leading newspaper
revealed the following result:

                                          Yes    No    No opinion
Are you in favour of another round
of fighting with Pakistan?                65     25    10
Uses in War
(i) Active lead by Officers. A statistical analysis of the Indian and
Pakistani casualties during the Indo-Pakistani War of September 1965
revealed that the proportion of officers among those killed was higher
on the Indian side. This showed that the Indian armies were actually
led by their officers and this was one of the important factors responsible
for the Indian victory. This factor will assume importance in the formulation
of future war strategy.
(ii) Training in the Use of War Equipment. The heavy reverses
suffered by Pakistan during the above War, despite its vastly
superior Air Force and Armoured Corps, came as a great surprise to the
whole world. Statistical analysis of the causes revealed that a high
and positive correlation existed between the period and intensity of
training in the use of aeroplanes and tanks and their effective use in war.
A further investigation into the period and intensity of training provided
in both the countries revealed that the Pakistani failure to make effective
use of its fighters, bombers and tanks was due to inadequate and inferior
training of its personnel.
(iii) Inspection of purchases. During a war, military requirements
of goods and commodities increase tremendously. Complete inspection
of each and every item involves huge expenditure and the time of a
large number of personnel, and it cannot be done expeditiously.
Here statistics come to the help of the army. The use of the sampling
inspection method not only helps in quick disposal but also gives accurate
results. Under this method only a few items, say 2 per cent, are selected
on a random sample basis and thoroughly inspected. This method is both
cheaper and more expeditious. It also ensures accuracy, as it is easier to
inspect a few items more closely than a large number of items.
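The sampling inspection method described above can be sketched in a few lines of Python. This is only an illustration under assumed figures: the lot of 10,000 items, the 3 per cent defect rate and the function name inspect_sample are hypothetical and do not come from the text; only the 2 per cent sampling fraction does.

```python
import random

def inspect_sample(lot, fraction=0.02, seed=0):
    """Inspect a simple random sample of the lot.

    Returns (number of items inspected, share of them found defective).
    """
    rng = random.Random(seed)               # fixed seed so the run is repeatable
    n = max(1, round(len(lot) * fraction))  # 2 per cent of the lot by default
    sample = rng.sample(lot, n)             # every item has an equal chance
    defective = sum(1 for item in sample if item)
    return n, defective / n

# Hypothetical lot: 10,000 items of which 300 (3 per cent) are defective.
lot = [True] * 300 + [False] * 9_700
inspected, estimated_rate = inspect_sample(lot)
print(f"Inspected {inspected} items; estimated defect rate {estimated_rate:.1%}")
```

Only 200 of the 10,000 items are examined, and the share of defectives among them serves as an estimate of the share in the whole lot. The estimate will differ somewhat from the true rate from one sample to another.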
LIMITATIONS OF THE SCIENCE OF STATISTICS
and yet many persons in the group might have become poorer than what
they were before. Statistical methods ignore such individual cases.
Shifting of Definition
(1) Monthly and Hourly Wage Rates. A firm had introduced productivity
methods with the result that productivity had increased. Since
the demand for its product was inelastic and labour laws did not permit
retrenchment, it decided upon reducing the working hours. As a
result, the monthly rates of wages could be increased only marginally.
A dispute arose between labour and management. The contention of
labour was that despite a significant increase in productivity, wages
had increased only marginally; and in support of its argument, it
produced monthly wage statistics. The management's argument was
just the opposite. It maintained that the increase in wages had been
commensurate with the increase in productivity; and in support of its
contention, it produced average hourly wage statistics. Both
labour and management were right. It depends on the definition of
wages which is adopted. Labour's definition will be considered
more appropriate when wages are viewed as the income of workers; and that
of management will be more appropriate when wages are viewed as a cost
of production.
(2) Employment of Women. The census of 1961 showed that the
percentage of working women in India had increased from 23.30 in 1951
to 27.96 in 1961. It might be concluded from this that the female labour
participation ratio increased significantly during the decade. But as a
matter of fact, a major part of this increase was due to the inclusion of
unpaid family workers and housewives under the nomenclature 'workers'.
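The wage dispute in example (1) above can be made concrete with a small calculation. The figures below (monthly wages of 400 and 420 rupees, monthly hours of 200 and 168) are purely hypothetical and are chosen only to show how the same facts give two different rates of increase.

```python
def pct_change(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100

# Hypothetical figures, not taken from the text.
monthly_before, monthly_after = 400.0, 420.0   # rupees per month
hours_before, hours_after = 200.0, 168.0       # hours worked per month

hourly_before = monthly_before / hours_before
hourly_after = monthly_after / hours_after

monthly_rise = pct_change(monthly_before, monthly_after)
hourly_rise = pct_change(hourly_before, hourly_after)
print(f"Monthly wage rose by {monthly_rise:.1f} per cent")
print(f"Hourly wage rose by {hourly_rise:.1f} per cent")
```

With these assumed figures the monthly wage rises by 5 per cent while the hourly wage rises by 25 per cent, so labour and management can each quote a true figure in support of opposite conclusions.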
Inappropriate comparison
(1) Deaths in Hospitals. The statement that 'the incidence of death
among sick persons is higher in hospitals than at home' is likely to lead
to the conclusion that more patients die in hospitals than at home due
to lack of proper treatment and care. But this conclusion turns out to be
completely erroneous if it is borne in mind that in India only seriously
ailing persons are hospitalised.
(2) It was claimed by a teacher that his teaching method was
superior to that of others. He supported his argument by showing that all
the students in his class secured a first class. Investigation into the
matter revealed that, unlike the others, all his students had secured a first class
in the previous examination and were merit holders. His success was,
therefore, due to the better quality of students in his class rather than to the superiority of
his teaching method.
DISTRUST OF STATISTICS
and should work like a true researcher without any preconceived notions
or conclusions about the problem under investigation. It should not be
forgotten, as W. I. King said, that "statistics is a most useful servant but
only of great value to those who understand its proper use."
Questions
1. Discuss the preliminary steps which should be taken before commencing
the work of 'collection' of data.
2. Why is it necessary to determine the object and scope of the enquiry before
planning an investigation?
3. What is a statistical unit? Is it necessary that the data be homogeneous?
(B. Com. Agra, 1939).
4. What steps would you take to organise an economic survey of a typical
Indian village?
5. Describe the various stages in conducting a primary economic investigation.
What precautions will you take at each stage? (M. A. &IJ Punjab, 1950).
6. What is meant by (a) units of collection, and (b) units of analysis? Explain
their respective uses.
7. Differentiate between simple and composite units. Give examples of each.
8. Write a note on the purpose and utility of planning a statistical investigation.
9. What is meant by degree of accuracy? How should it be determined?
10. Distinguish between primary and secondary data. Illustrate your answer
with examples.
Collection of Primary and
Secondary Data 5
Primary and secondary data. After the preliminaries discussed in
the last chapter have been gone through, the task of the collection of
data begins. Statistical data, as we have already seen, can be either
primary or secondary. Primary data are those which are collected for
the first time and are thus original in character, whereas secondary
data are those which have already been collected by some other persons
and which have passed through the statistical machine at least
once. Primary data are in the shape of raw materials to which statistical
methods are applied for the purpose of analysis and interpretation.
Secondary data are usually in the shape of finished products since
they have been treated statistically in some form or the other. After
statistical treatment the primary data lose their original shape and become
secondary data. On a closer examination it will be found that the
distinction between primary data and secondary data in many cases is one
of degree only. Data which are secondary in the hands of one may be
primary for others. Statistics of agricultural production are secondary
data for the Agriculture Department of a Government, but for the purpose
of calculation of national income these data are primary, because
they will have to go through further analysis and their shape will not
remain the same.
Factors affecting choice of method. It is obvious that the methods
of the collection of primary data and secondary data would not be exactly
identical, because in one case the data have to be originally collected
while in the other the work is of the nature of compilation. There are
various methods of the collection of primary and secondary data and the
choice of the method depends on a number of factors. The nature, object
and scope of the enquiry are the most important things on which the
selection of the method depends. The method selected should be
such that it suits the type of enquiry that is being conducted.
Availability of finance is another factor which influences the selection
of the method of collection of data. When the financial resources at
the disposal of the investigator are scanty he will have to leave aside
expensive methods even though they are better than others which are
comparatively cheap.
Availability of time has also to be taken into account. Some methods
involve a long duration of enquiry while with others the enquiry can be
conducted in a comparatively shorter duration. The time at the disposal
of the investigator thus affects the selection of the technique by which
data are to be collected.
By local reports
The last method of collection of primary data is through local
reports. In this method data are not formally collected by enumerators
but by local correspondents or agents in their own fashion and to
their own liking. Obviously such data cannot be very reliable and
as such this method is used in those cases where the purpose of the
investigation can be served with rough estimates only and where a high degree
of precision is not necessary. This method has the advantage of being
the least expensive and it also saves the bother usually associated with
statistical investigations of other types.
REPRESENTATIVE DATA
As has been pointed out previously, a statistical investigation can
be either of the census type or of the sample type. In a census enquiry all the
units associated with a particular problem are taken into account, whereas
in a sample enquiry only a few selected units are studied and on the
basis of such studies attempts are made to draw generalisations which
may be applicable to the whole data. If, for example, we have to find
out the average monthly expenditure of the 2000 students residing in the
hostels of the Allahabad University and if we hold a census investigation,
we shall have to study the monthly expenditure of each one of these 2000
students. If, however, we hold a sample investigation we shall select, say,
200 students out of these 2000 and then study their expenditure. On the
basis of the study of these 200 units (technically called a "sample") we
can draw conclusions which will hold good about the expenditure of all
the 2000 students (technically called a "universe" or "population").
The sample is considered to be representative of the universe, and if the
sample has been properly selected and if its size is adequate, whatever
holds good for the sample should also hold good for the universe. If
the scope of the enquiry is very wide a census investigation would be not
only very expensive but highly cumbersome also. Moreover it will
take a very long time and require a large number of enumerators. In
such cases a sample investigation is very suitable. A sample usually
gives representative data and the generalisations made on the basis of
such data usually hold good for the universe.
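The hostel example above can be imitated with a short Python sketch. The expenditure figures are artificial: they are generated with an assumed average of 150 rupees and an assumed spread of 30 rupees, neither of which comes from the text. Only the sizes of the universe (2000) and of the sample (200) follow the example.

```python
import random
import statistics

rng = random.Random(42)   # fixed seed so the run is repeatable

# Hypothetical universe: monthly expenditure (in rupees) of 2000 students.
universe = [rng.gauss(150, 30) for _ in range(2000)]

# Sample investigation: 200 students chosen at random, without replacement.
sample = rng.sample(universe, 200)

census_mean = statistics.mean(universe)
sample_mean = statistics.mean(sample)
print(f"Census average: {census_mean:.2f}")
print(f"Sample average: {sample_mean:.2f}")
```

The sample average is obtained from one-tenth of the work of the census and will normally lie close to the census average, though the two will seldom coincide exactly.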
The most important point, however, is the selection of the sample.
A sample study would give dependable conclusions only if the sample is
a true representative of the universe. Broadly speaking there are two
methods by which samples can be selected and they are:
(1) Deliberate or purposive sampling,
(2) Random or chance sampling.
Deliberate selection or purposive sampling
In deliberate selection or purposive sampling the investigator himself
chooses from the universe a few such units which according to his
estimates are the best representatives of the population. His selection is
I For a detailed study see chapters on Sampling.
any king is clearly 4/52 and the chance of its being any spade is
13/52. This clearly indicates that if the chances of selection of all the
units in a universe are equal, and if selections are made from it at random,
then the likelihood is that in the sample so selected the various types
of units would be in the same proportion in which they are in the universe.
On this basis it is said that random sampling gives a representative sample
which contains the characteristics of the population. Further, as
has been pointed out earlier, the size of the sample and its accuracy are
also related. In ten tosses of a coin it is not unlikely that seven times it
falls heads and only three times tails. But if there are 100 tosses the
proportion of heads is likely to be much closer to one half. If the number of tosses
is 1000 the proportions of heads and tails are likely to be closer still to equality.
The bigger the size of the sample the greater is the chance of accuracy.
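The effect of the number of tosses can be checked exactly with the binomial distribution. The sketch below is an illustration only; the function name prob_heads_between and the band of 40 to 60 per cent heads are choices made here, not taken from the text.

```python
from math import comb

def prob_heads_between(n, low, high):
    """Exact probability that n tosses of a fair coin give low..high heads."""
    favourable = sum(comb(n, k) for k in range(low, high + 1))
    return favourable / 2 ** n

for n in (10, 100, 1000):
    low, high = 4 * n // 10, 6 * n // 10      # 40 to 60 per cent heads
    exactly_half = prob_heads_between(n, n // 2, n // 2)
    near_half = prob_heads_between(n, low, high)
    print(f"{n:>4} tosses: P(exactly half heads) = {exactly_half:.4f}, "
          f"P(40-60 per cent heads) = {near_half:.4f}")
```

It is the proportion of heads, rather than the exact count, that settles down as the tosses increase: an exactly equal split becomes less likely with more tosses, while a split close to one half becomes almost certain.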
Law of statistical regularity. Thus according to the rules of the
theory of probability, if from the universe a moderately large sized sample
is chosen at random, it is almost certain that on an average the sample so
chosen will have the same characteristics as the universe. It is on this
basis that games of chance are played successfully by a large number of persons
and the insurance companies are able to insure people against various
types of calamities. In statistics this law is known as the "Law of Statistical
Regularity". It is a corollary to the main theory of probability.
The theory of probability tells us of the mathematical expectation of the success or
failure of an event and on this basis the law of statistical regularity tells us that
random selection from the universe is very likely to give a representative sample.
Law of inertia of large numbers. We have mentioned above that
there is a relationship between the size of a sample and its accuracy.
The larger the sample the greater would be the accuracy. The reason
for this lies in the fact that in large numbers the chances of compensatory
action are greater. If in the first ten tosses of a coin there are seven heads
and three tails, it is quite likely that in the next ten tosses the situation
might be reversed and there may be seven tails and three heads. The
larger the number of such experiments the greater are the chances of
one irregularity compensating the other. It is said on this basis that
large numbers have got an inertia or that they are more constant. The
production of wheat in the district of Allahabad might show great variations
year after year but the production figures of the state of U. P. would
not vary much, because if in some districts the crop is above normal it is
very likely that in others it might be below normal. Similarly the production
figures of wheat for the whole of India would show still less
variation and the figures of world production would show hardly any
significant change. This phenomenon is characterised as the "Law of
Inertia of Large Numbers" which states that large numbers are relatively
more constant and stable than small ones. It is on the basis of this law
that we say that the larger the size of the sample the greater would be its
accuracy.
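The compensating effect described above can be seen in a small simulation. Everything in it is assumed for illustration: each district is given a normal output of 100 units, its yearly output is allowed to vary independently by up to 30 per cent either way, and a state is taken to consist of 50 such districts. Real districts are not fully independent, so the actual gain in stability would be smaller.

```python
import random
import statistics

def coefficient_of_variation(values):
    """Standard deviation expressed as a fraction of the mean."""
    return statistics.pstdev(values) / statistics.mean(values)

def yearly_totals(districts, years, rng):
    """Total output per year when each district varies independently."""
    return [
        sum(100 * (1 + rng.uniform(-0.3, 0.3)) for _ in range(districts))
        for _ in range(years)
    ]

rng = random.Random(1)   # fixed seed so the run is repeatable
one_district = yearly_totals(1, 200, rng)
fifty_districts = yearly_totals(50, 200, rng)

cv_one = coefficient_of_variation(one_district)
cv_fifty = coefficient_of_variation(fifty_districts)
print(f"Relative variation, 1 district:   {cv_one:.3f}")
print(f"Relative variation, 50 districts: {cv_fifty:.3f}")
```

The total for fifty districts fluctuates far less, in relative terms, than the figure for a single district, because a poor year in some districts is offset by a good year in others.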
It should not be concluded from the above discussion that the law
of inertia of large numbers does not allow any change in figures with the
passage of time. All that it means is that large numbers are more constant
and stable than small ones. There are no violent fluctuations in large
numbers. After all, the figures of world production of wheat do change
from time to time but these changes are not violent and sudden. They
are slow and gradual. The long-period trend is indicated by large numbers:
they simply ignore the short-period regular and irregular fluctuations.
COLLECTION OF SECONDARY DATA
Sources of secondary data
We know that secondary data are those which have already been
collected and analysed by someone else, and as such the problems
associated with the original collection of data do not arise here. Secondary
data may be either published or unpublished. The sources of published data
are usually:
(a) Official publications of the central, state and local governments.
(b) Official publications of foreign governments or international
bodies like the United Nations Organization and its
subsidiary bodies.
(c) Reports and publications of trade associations, chambers of
commerce, banks, co-operative societies, stock exchanges, and
trade unions, etc.
(d) Technical trade journals like the Economica, Indian
Journal of Economics, Commerce, Capital, etc., and books
and newspapers.
(e) Reports submitted by economists, research scholars, university
bureaus and various other educational associations, etc.
The sources of unpublished data are varied, and such materials may
be found with scholars and research workers, trade associations, chambers
of commerce, labour bureaus, etc. Many enquiries of a private
nature are conducted by these bodies and their findings are not published
and are usually meant for the consumption of their members only.
22. What is 'sampling' and what are its uses? Explain how you would design
a sample survey to estimate an average size of holding in a locality.
(M. A. A".4. 1947).
23. "It is never safe to take published statistics at their face value without knowing
their meanings and limitations and it is always necessary to criticise the arguments
that can be based on them." (Bowley). Elucidate. (B. Com. Allahabad, 1946).
24. Why is it necessary to scrutinize and edit secondary data before its use?
What precautions would you take before using such statistics ?
25. Write short notes on :
(a) Theory of Probability.
(b) Law of Statistical Regularity.
(c) Law of Inertia of Large Numbers.
26. "In any sample survey there are many sources of errors. A perfect survey
is a myth". Discuss the statement.
27. Suppose you want to study the changes in the extent of indebtedness of
middle-class people of Allahabad for the next five years. How would you proceed
to do it ? Explain all the processes. (B. Com. Banaras, 1955).
28. Describe the procedure you would adopt in order to obtain the necessary
information for introducing compulsory primary education in a big city.
(B. Com. Banaras, 19'2.).
29. "Statistics, especially other people's statistics, are full of pitfalls for the user".
(Connor) Do you agree with this statement ?
30. "Samples are devices for learning about large masses by observing a few
individuals". (Snedecor).
Elucidate the above statement.
31. How would you conduct an enquiry about 'Payment of Wages' in an
industry ? On what points would it be necessary for you to be clear before actually
beginning investigation work? (M. Com. Agra, 1957).
32. How would you organise a marketing survey of the fruit trade in a particular
region with a view to making suggestions for its development? Explain the
procedure you would follow step by step. (M. Com. Agra, 1956).
Accuracy, Approximation
And Errors 6
Editing of data. After collection of data the next step in a statistical
investigation is the scrutiny of the collected figures. This is technically
called editing of data. It is a necessary step as in most cases the collected
data contain various types of mistakes and errors. It is quite likely
that some question has been misunderstood by the informants, and if it
is so, this part of the data has to be collected afresh, or it may be, that
answers to a particular question are, in general, vague, and it is difficult
to draw inferences from them, or some of the schedules and questionnaires
are so haphazardly filled that it is necessary to reject them. It is
also likely that some of the investigators were biased and the answers
filled by them or the data collected by them show unmistakable signs of
their prejudices. In all such cases the collected data have to be edited
and modified. However, it should be clearly understood that undue
tampering of data should never be done. If only a few schedules are
defective they can be omitted but this too should be done very carefully.
In some cases the omission of a few schedules would not affect the general
conclusions, while in others this may entirely change the complexion
of the problem under study. As has been pointed out earlier, absolute
accuracy is neither possible nor essential but decision about the extent
to which inaccuracies, approximations and errors can be allowed, is a
very important step in statistical analysis and we shall study these things
in the following pages.
ACCURACY
coin may fall heads in 3 tosses out of 4 but in 3000 tosses the number of
heads and tails are bound to be more or less equal. There is a general
tendency everywhere to give ages in round figures. It is another
example of unbiased error. If some people have, in this process, overestimated
their ages, others might have underestimated them. A person
of 29 years of age may call himself of 30 but it is also likely that a person
of 31 years may call himself of 30, and in such a case the errors cancel
each other.
The following table will illustrate the characteristics of the biased
and unbiased errors : 
TABLE I
Biased and unbiased errors

Exact         Correct to       Absolute     Correct to         Absolute
number        nearest 1000     error        next 1000          error
              (unbiased)                    and over (biased)
50,241             50            +241            51               -759
60,507             61            -493            61               -493
49,361             49            +361            50               -639
61,427             61            +427            62               -573
53,764             54            -236            54               -236
48,090             48             +90            49               -910
50,460             50            +460            51               -540
96,670             97            -330            97               -330
60,250             60            +250            61               -750
Total 5,30,770    530            +770           536              -5230
When figures are estimated correct to the nearest thousand the
error is an unbiased one. The unbiased absolute error in the above
figures, as shown in column 3, is only 770 and the relative error is
$\frac{770}{5{,}30{,}000} = 0.001453$. The errors are negligible.
When figures are estimated correct to the next one thousand and
over, the error is a biased one. The biased absolute error in the
above case is -5230 as shown in column 5 and the relative error is
$\frac{5230}{5{,}36{,}000} = 0.00975$. These errors are comparatively much more than
in the previous case and cannot be safely ignored.
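The totals of Table I can be recomputed with a few lines of code. This is a minimal sketch: the helper names `nearest_thousand`, `next_thousand` and `errors` are invented here for illustration, and the relative error is taken on the estimated total, which is an assumption that reproduces the figures 0.001453 and 0.00975 quoted above.

```python
# Exact figures from Table I.
exact = [50241, 60507, 49361, 61427, 53764, 48090, 50460, 96670, 60250]


def nearest_thousand(x):
    """Unbiased approximation: round to the nearest 1000."""
    return (x + 500) // 1000 * 1000


def next_thousand(x):
    """Biased approximation: always round up to the next 1000."""
    return -(-x // 1000) * 1000


def errors(values, approximate):
    """Absolute error (exact minus estimate) and relative error of the total."""
    estimates = [approximate(v) for v in values]
    absolute = sum(values) - sum(estimates)
    relative = absolute / sum(estimates)  # assumption: error relative to the estimate
    return absolute, relative


unbiased_abs, unbiased_rel = errors(exact, nearest_thousand)
biased_abs, biased_rel = errors(exact, next_thousand)
print(unbiased_abs, round(unbiased_rel, 6))  # 770 0.001453
print(biased_abs, round(biased_rel, 6))      # -5230 -0.009757
```

The unbiased errors largely cancel in the total, while the biased errors all carry the same sign and therefore accumulate.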
Errors in multiplication, division, etc. However, it should be
remembered that neither are unbiased errors always compensating
nor biased errors always cumulative. Where items have to be added
together biased errors would no doubt be cumulative and unbiased
ones compensating; but where items have to be subtracted the situation
is just the reverse, and biased errors would be smaller in size than the
unbiased ones. If two items are multiplied together unbiased errors
60 FUNDAMENTALS OP STATISTICS
would give a better estimate than the biased ones. But if the items
are divided and the algebraic signs of the two figures are the same
(as is the case in biased errors) the result would be quite close to the
true value; and if the signs are opposite (as is the case in unbiased errors)
the results would be away from the true value. In other words, ordinarily,
unbiased errors are compensating only when items have to be added
or multiplied but when the items have to be subtracted or divided
biased errors would give results closer to the true value than the results
given by unbiased errors.
These points can be illustrated as follows :
True Value     Estimated value with     Estimated value with
               biased error             unbiased error
(a) 100               99                       99
(b) 200              197                      202
(i) Biased error in (a) = 1 and unbiased error = 1
(ii) Biased error in (b) = 3 and unbiased error = -2
(iii) Biased error of (a + b) or 300 = (300 - 296) or 4 and
unbiased error = (300 - 301) = -1
(iv) Biased error for (b - a) or 100 = (100 - 98) = 2 and
unbiased error = (100 - 103) = -3
(v) Biased error for (a × b) or 20,000 = (20,000 - 19,503) = 497 and
the unbiased error = (20,000 - 19,998) = 2
(vi) Biased error for (b ÷ a) or (200 ÷ 100) or 2 = (2 - 197/99) = 0.01
and the unbiased error = (2 - 202/99) = -0.04.
Thus it is clear that in addition and multiplication the biased
errors are more than the unbiased ones whereas in subtraction and
division the position is reverse and the unbiased errors are more than
the biased ones.
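The six results of the illustration can be verified mechanically. The sketch below uses the true and estimated values given in the text; the function name `errors` and the dictionary keys are illustrative choices only.

```python
true_a, true_b = 100, 200
biased = (99, 197)     # both estimates err in the same direction
unbiased = (99, 202)   # the two estimates err in opposite directions


def errors(estimates):
    """Error (true result minus estimated result) for each arithmetic operation."""
    a, b = estimates
    return {
        "sum": (true_a + true_b) - (a + b),
        "difference": (true_b - true_a) - (b - a),
        "product": true_a * true_b - a * b,
        "quotient": round(true_b / true_a - b / a, 2),
    }


print(errors(biased))    # {'sum': 4, 'difference': 2, 'product': 497, 'quotient': 0.01}
print(errors(unbiased))  # {'sum': -1, 'difference': -3, 'product': 2, 'quotient': -0.04}
```

In size, the biased errors are larger for the sum and the product, and the unbiased errors are larger for the difference and the quotient.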
Estimation of errors
In most of the statistical investigations in actual practice the exact
figures or the true values are not known. In such cases we cannot
measure the absolute or the relative error. But it is possible to estimate
them.
Estimation of unbiased errors. Unbiased errors can be estimated
without much difficulty in most of the cases. In the illustration in
Table I if the actual figures were not known, all we could say was
that the total of the figures (correct to nearest 1000) was 5,30,000.
If the absolute error in the above figures is to be estimated then in
each of the nine items it can range between 0 and 499. It will be
zero if the actual number was in exact thousands, and in such a case
the actual and the approximated figures would be the same. The
maximum error in any figure can be 499 because the approximated
figure will be discarding all numbers less than 500 and adding all
ACCURACY, APPROXIMATION AND ERRORS
Questions
1. Write a note on the editing of primary and secondary data for the purposes of
analysis and interpretation.
2. The statistician who desires to safeguard his analysis and results from
imperfections entering at the very start should rest his choice among sources upon a test
of reliability rather than upon accessibility and convenience.
Expand this statement so as to bring out clearly the way in which sources should
be used. (M. Com. Lucknow, 1943).
3. Discuss the standard of accuracy required in statistical calculations. To what
extent should approximations be used? (M. A. Agra, 1949).
4. What precautions should be taken in the use of published statistics?
(B. Com. Agra, 1949).
5. Mention the advantages of approximation of Statistics. What degree of
accuracy is generally required in each statistical investigation?
(M. Com. Rajputana, 1951).
6. What are the different ways of approximating figures ? Discuss the merits
of each.
7. To what extent can figures be safely approximated in statistical analysis?
How should such figures be written ?
8. (a) Discuss the sources of errors in statistics and their effects.
(b) State the important methods of approximation and their utility in
statistics. (B. Com. Agra, 1940).
9. In what way does a statistical error differ from a mistake? What classes of
errors are there and how may they be measured? (B. Com., Allahabad, 1943).
10. Discuss the various types of errors likely to creep into statistical investigations
and suggest how to avoid or correct them. (B. Com. Agra, 1949).
11. Of the biased errors the statistician should have none : but of the unbiased
ones the more the merrier, notwithstanding that they are also errors. Elucidate.
12. In framing statistical estimates we are not so definite as the Modern Traveller
who:
"........ knew the weather to a T,
Longitude to a degree,
The Latitude exactly."
Explain the bearing of the above, on the degree of accuracy desired in statistical
estimates as distinguished from the estimates of the more exact sciences.
(M. A. Punjab, 1952).
13. Show how biased errors are generally cumulative and unbiased ones
compensating. Are there any exceptions to this general rule?
14. Discuss the various methods of estimating biased and unbiased errors both
absolutely and relatively.
15. Distinguish between
(a) Absolute and relative errors and
(b) Biased and unbiased errors.
Discuss the effects of these errors and explain the steps that are taken to meet the
effects. (B. Com. Agra, 1938).
Classification, Seriation and
Tabulation
7
CLASSIFICATION
Need and meaning
The data which are collected or compiled in accordance with the
rules and methods discussed in the preceding chapter are usually very
voluminous and large in quantity. As such they are not directly fit for
analysis or interpretation. If, for example, the figures of the expenses
of 2,000 students residing in Allahabad University hostels are before
us, as collected, it would not be possible to draw any inferences from
them because for purposes of comparison, analysis and interpretation
it is essential that the data are in a condensed form. Further, it is
also essential that the likes must be separated from the unlikes. All
the 2,000 students, no doubt, are alike in the sense that all of them
belong to a particular university and live in hostels but they differ in
other respects. Some may be living in single-seated rooms and others
in double or treble-seated rooms; some may be living in costlier hostels
and others in comparatively cheaper ones; some may be having their
private messing arrangements while others may have joined the common
mess. Thus, even though the data collected relate to one set of persons
yet there may be many types of dissimilarities even within this group.
For the purpose of analysis and interpretation, data have to be divided
in homogeneous groups. In order to remove these defects of volume
and heterogeneity, statistical data are tabulated with a view to present
a condensed and homogeneous picture. But before the tabulation of
data, it is necessary to arrange them in homogeneous groups so that
there may be no difficulty in tabulation. The process of arranging data in
groups or classes according to resemblances and similarities is technically called
Classification. Thus, by classification we try to strike a note of homogeneity
in the heterogeneous elements of the collected information. Classification
gives expression to the similarities which may be found in the
diversity of individual units. In classification of data units having
a common characteristic are placed in one class and in this fashion the
whole data are divided into a number of classes. Even after classification
the statistical data are not fit for comparison and interpretation but this
is certainly the first step towards the tabulation of data. After tabulation
of data statistical analysis and interpretation are possible. Classification
is a preliminary to tabulation and it prepares the ground for
proper presentation of statistical facts.
Characteristics of an ideal classification
Despite the fact that classification is a very important preliminary
in a statistical analysis no hard and fast rules can be laid down for it.
TABLE 1
I          II                    III        IV
0-10       0 and under 10        0-9         5
10-20      10 and under 20       10-19      15
20-30      20 and under 30       20-29      25
In the first method, items whose values are just 10 or 20 can
be classified either in 0-10 group and 10-20 group respectively or in
10-20 and 20-30 classes respectively. Usually in such cases the item
is classified in the next higher class so that the item whose value is
exactly 10 would come in 10-20 group. In the second method, this
point is made clear. Items whose values are less than 10 would
be in the 0-10 class interval. This is the exclusive method of classification.
In exclusive method the items whose values are equal
to the upper limit of a class are grouped in the next higher class.
In other words, the upper limit of a class is excluded and items with
values less than the upper limit are taken into account. As against
this the third method is inclusive. In it the upper limit is also included
in the class interval. This method, in reality, is like the second
method as 0-9 means 0 and under 10. To emphasise this point sometimes
the class interval is written as 0-9.99. The fourth method indicates
only the mid-points.
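The difference between the exclusive and the inclusive methods can be made concrete in code. The function names below (`exclusive_class`, `inclusive_class`, `mid_point`) are illustrative assumptions, and the inclusive version is written for whole-number data only.

```python
def exclusive_class(value, width=10):
    """Class limits under the exclusive method: 'lower and under upper'."""
    lower = (value // width) * width
    return (lower, lower + width)


def inclusive_class(value, width=10):
    """Class limits under the inclusive method, for whole-number values."""
    lower = (value // width) * width
    return (lower, lower + width - 1)


def mid_point(lower, upper):
    """Mid-point of an exclusive class, as used in the fourth method."""
    return (lower + upper) / 2


print(exclusive_class(10))   # (10, 20): the value 10 goes to the next higher class
print(exclusive_class(9))    # (0, 10)
print(inclusive_class(9))    # (0, 9)
print(mid_point(0, 10))      # 5.0
```

Under both methods an item of value 10 falls in the second class; only the way the class limits are written differs.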
Counting the number of items in each class. After deciding the number
of classes, their magnitude and class limits, the next thing to be done
is to count the number of items falling in each class. This can be done
in any of the following ways :
(a) By tally sheets. Under this method, the class intervals are
written on a sheet of paper (called Tally Sheet) and for each item a
stroke is marked against the class interval in which it falls. Usually
after every four strokes in a class, the fifth item is indicated by drawing
a horizontal or diagonal line over or through the strokes. These groups
of five are easy to count. Data sorted in such a manner would give
the following type of tally sheet.
TABLE 2
Number of marks obtained by 80 students
(Tally Sheet)

Marks      Tally                                        Total
20-30      ||||/ ||||/ ||                                 12
30-40      ||||/ ||||/ ||||/ |||                          18
40-50      ||||/ ||||/ ||||/ ||||/ ||||/ ||||/ |          31
50-60      ||||/ ||||/                                    10
60-70      ||||/ ||||                                      9
Total                                                     80
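A tally sheet of this kind can also be produced by a short program. The sketch below is only illustrative: the marks in `marks` are hypothetical (the 80 original marks are not given in the text), the names `tally_marks` and `tally_sheet` are invented here, and `||||/` is used as a plain-text stand-in for a crossed group of five strokes.

```python
from collections import Counter


def tally_marks(count):
    """Render a count as groups of five ('||||/') followed by loose strokes."""
    groups, rest = divmod(count, 5)
    parts = ["||||/"] * groups
    if rest:
        parts.append("|" * rest)
    return " ".join(parts)


def tally_sheet(values, lower, upper, width):
    """Count the values falling in each exclusive class from lower to upper."""
    counts = Counter((v - lower) // width for v in values if lower <= v < upper)
    rows = []
    for i in range((upper - lower) // width):
        lo = lower + i * width
        rows.append((f"{lo}-{lo + width}", tally_marks(counts[i]), counts[i]))
    return rows


marks = [22, 35, 41, 47, 58, 63, 30, 49, 40, 29, 55, 44]  # hypothetical data
rows = tally_sheet(marks, lower=20, upper=70, width=10)
for label, strokes, total in rows:
    print(f"{label:<8}{strokes:<12}{total:>3}")
```

With these hypothetical marks the class 40-50 receives five items and is therefore shown as one complete group of five.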
TABLE 6
Marks obtained by 100 students in statistics (sexwise)

Marks      Males     Females     Total
30-40         8          6         14
40-50        16         10         26
50-60        14         16         30
60-70        12          8         20
70-80         6          4         10
Total        56         44        100
TABLE 8
Marks obtained by students (sexwise, on the basis of civil
conditions and residence)
[Blank manifold table. Rows: residence (Hostellers, a second residence group, and Total), each subdivided by the marks classes 30-40, 40-50, 50-60, 60-70, 70-80 and Total, ending in a Grand Total. Columns: number of students, beginning with Males. The remaining column headings are not legible in the source.]
The above table gives information about a large number of interrelated
questions regarding students, namely, about the marks obtained,
sexwise distribution, civil conditions and residence. Manifold tables
are very useful in presenting population census data.
Rules of tabulation
Having discussed the meaning, importance and types of tabulation,
it is necessary to lay down certain rules regarding construction of tables.
The following general rules should be observed in the construction
of tables :
1. The table should be precise and easy to understand. It should
not be necessary to go through footnotes or explanation to properly
understand a table.
2. If the data are very large they should not be crowded in a
single table. This would increase the chances of mistakes and would
make the table unwieldy and inconvenient. Such data can be presented
in a number of tables. Each table should be complete in itself and
should serve a particular purpose.
3. The table should suit the size of the paper and, therefore,
the width of the columns should be decided beforehand.
4. There should be thick lines to separate the data under one
class, from the data under another class and the lines separating the
subdivisions of classes should be comparatively thin.
5. The number of main headings should be few though there
is no harm if the number of subheadings is large. This will help in
understanding the main points of the table.
6. Captions, headings or subheadings of columns, and
headings and subheadings of rows must be self-explanatory.
7. Those columns whose data are to be compared should be
kept side by side. Similarly percentages, totals and averages must also
be kept close to the data.
CLASSIFICATION, SERIATION AND TABULATION 75
8. As far as possible figures should be approximated before
tabulation. This would reduce unnecessary details.
9. The units of measurement under each heading or subheading
must always be indicated.
10. Total of rows should be placed in the extreme right column,
though sometimes they are placed in the first column after the vertical
captions on the left. The totals of columns should ordinarily be placed
at the foot though in some cases it is helpful to place them at the top of
the table.
11. Items should be arranged either in alphabetical, chronological
or geographical order or according to size, importance, emphasis or
causal relationship to facilitate comparison.
12. If certain figures are to be emphasised they should be in
distinctive type or in a "box" or "circle" or between thick lines.
13. When percentages are given side by side with original figures
they should be in a separate type, preferably italics.
14. If some portion of collected data cannot be classified in any
class or division a miscellaneous class should be created and the data
shown in it.
15. There should be a proper title to each table. It should tell
what exactly the table presents.
Besides the rules mentioned above, the figures should be scrutinized
before being entered in a table. Below a table should be given
the method of collection, sources of data, general results obtained and
their limitations. The probable error should also be mentioned.
It should be remembered that there cannot be any rigidity about
these rules. Tables must suit the needs and requirements of an
investigation. Bowley has correctly said that "in collection and tabulation
common sense is the chief requisite and experience the chief
teacher."
Questions
1. What do you understand by classification, seriation and tabulation? Discuss
their importance in a statistical analysis.
2. "Classification is the process of arranging things (either actually or notionally)
in groups or classes according to their resemblances and affinities giving expression
to the unity of attributes that may subsist amongst a diversity of individuals."
Elucidate the above statement. (B. Com. Allahabad, 1947).
3. How would you proceed to classify the observations made and what points
will you take into consideration in tabulating them? Mention the kinds of tables
generally used. (B. Com. Agra, 1941).
4. What precautions would you take in tabulating your data ?
(B. Com. Agra, 1933).
5. "In collection and tabulation common sense is the chief requisite and
experience the chief teacher." Bowley.
What precautions in your opinion are necessary to avoid statistical errors in the
collection and computation of primary data? (M. A. Agra, 1940).
6. Discuss the main functions and importance of tabulation in a scheme of
investigation. Prepare blank tables to show distribution of students of a college according
to age, class and residence for arranging (a) Physical training and (b) Tutorial classes.
7. (a) Draw up a blank table with suitable headings, spacings, double lines,
etc. in which could be shown the number and tonnage of ships entered and cleared
at ports in India for 10 years distinguishing steam and sailing vessels and also those
with cargoes from those in ballast.
(b) What do you mean by "A statistical Unit of Measurement"? Give a
suitable illustration. (B. Com. Hons. AiMDTII, 194%).
8. Draw up two independent blank tables giving rows, columns and totals in
each case summarizing the details about the members of a number of families
distinguishing males from females, earners from dependants and adults from children.
9. Draw up in detail, with proper attention to spacing, double lines, etc.,
and showing all sub-totals, a blank table in which could be entered the numbers
occupied in six industries on two dates, distinguishing males from females, and
among the latter single, married and widowed. (M. A. Alld., 1940).
10. Explain how you would tabulate statistics of death from principal diseases
by sexes, in two different provinces in India for a period of five years.
(M. Com. Calcutta, 19").
11. Prepare a table with a proper title, divisions and subdivisions to represent
the following heads of information :
(a) Export of cotton piece-goods from India.
(b) To Burma, China, Java, Iran, Iraq.
(c) Amount of piece-goods to each country.
(d) Value of piece-goods to each country.
(e) From 1939-40 to 1945-46 year by year.
1" I,.I,.a.,
17. a" 19. 11,,14. 'I,.
i6.17. I,.
Ja, 18. ale 15. 20,
10. 22, .17, 21, 19. 19. 16. 18, II, 18, 10 •
11.
19. 17. t6. 14.
".,
10
I,
10
2ll ..... ,
0 n
45
.. ..
57 10
20
I,
50
H
1,
20
"
IS
S
"
24. In an enquiry to find out relation between age and monthly wages, the
following data were collected from 40 mill workers :
S. No. Age (Years) Wage (Rs.) S. No. Age (Years) Wage (Rs.)
I. 37 81 :11 41 89
1. al 100 aa 38 9a
3· 49 101 a3 41 8I
4· ,6 109 24 37 140
S· 57 (02. as 4S 94
6. 34 104 a6 4 6 .n9
7· 25 8( 2.7 28 99
8. 48 tit, a8 43 109
9. 51 100 2.9 41 92.
10. 41 89 30 31 no
n. 4, 15' 31 5S tao
12.. H 101 32 42 115
13· 38 99 H 4Q 119
14· 41 U3 H 4S 90
IS· 31 100 3S So 76
16. 30 99 56 24 IS8
17· 55 130 37 :n 76
til. 30 159 38 u 76
19· 2.9 90 59 al 94
ao. u 79 40 58 89
Tabulate the above data in the following form ! 
13, 81, 58, 81 SS, 7S, 61, 70, 84, 84, 81. 87, 67, 6" 62, 62, 61, S9, 5S, 57, 75,
72, 84, 91, 87, 76, 43, 83, 40 , 73, 86, 73, 43, 33, 76, 95, 73, 65, 77, 72, 72, 29,
43 85, 4%, 80, 75, 85,62, 57, 64, 70,95, 57, 74, ,0. 7S, 49, 55,64, 92, 73, 73, 96,.
69 51, 22, 7S, 80, 36, 70 8S, 47, 69,63, 53, 91, H. 69, 30. (AndbrfJ, B. A., 1914)
31. Tabulate the following data by taking 10 as the class-interval :
30, 45, 55, 65, 60, 90, lIS. 8s. 95> 100, 95, '65, 75. 8S, IZS, lIO, 87, 6"
100, lIS, 65, 60, 75, 9S, 130, 95, 125, II5, 6" 70, 9" 8" 6S, 60, 80, 8"
75, 95. 55, 45, 35, 45, 40, 85, 135, 140, 9S, 65, 4S, 3', U5, 90,80. IZS, 130,
~5. 90, 100, 95, 85, 85, uo, II5, 40, 35, 12 5, 35, lOS, 7',45,
(B. CtIIII., Vwa"" 1964).
Ratios, Percentages And
Logarithms 8
RATIOS AND PERCENTAGES
Need. After the statistical data have been collected, edited, classified
and tabulated, they are ready for further statistical analysis. In the
process of classification and tabulation the size of the data is considerably
reduced and a large number of figures are condensed. This is done with
a view to make the data easily understandable and fit for analysis and
interpretation. But even after condensation, data might be fairly large
in quantity and the figures may be very big and unwieldy. It may not be
easy to draw inferences from them. To remove this difficulty, sometimes,
ratios and percentages are calculated so that big figures are reduced to
small ones and a relative study of the data is possible. Absolute figures
are unfit for relative study and in statistical analysis where most of the
data are compared relatively, absolute figures, even though they are
essential, do not have very great significance.
Derivatives
Ratios and percentages are obtained by a combination of two or
more figures. They are derived from the absolute figures collected for the
purpose of investigation, and that is why they are sometimes referred
to as "derivatives." Derivative is a quantity obtained by the combination
of two or more figures. In a statistical analysis a variety of derivatives
are used. Ratios, percentages, rates, coefficients, measures of central
tendency and measures of dispersion, skewness, kurtosis are all statistical
derivatives. Ratios and percentages are simple derivatives while measures
of central tendency or averages of the first order and measures of
dispersion and skewness or averages of the second order are complex
derivatives, as in their calculation a number of statistical processes have to be
undergone. Simple derivatives may be either coordinate or subordinate.
When two or more parts of a universe are compared with each other with
the help of ratios or percentages these derivatives are called coordinate
derivatives, and when a part of the universe is compared with the whole
of the universe derivatives are said to be subordinate. The ratio of
females and males in a population is an example of coordinate derivative
and the ratio of females to the total population is a subordinate derivative.
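The distinction between coordinate and subordinate derivatives can be shown with a small calculation. The population figures below are hypothetical and chosen only for illustration; `Fraction` is used so that the ratios appear in their lowest terms.

```python
from fractions import Fraction

females, males = 480, 520  # hypothetical counts in a population of 1000

# Coordinate derivative: one part of the universe compared with another part.
coordinate = Fraction(females, males)

# Subordinate derivative: one part compared with the whole universe.
subordinate = Fraction(females, females + males)

print("females to males           :", coordinate)            # 12/13
print("females to total population:", subordinate)           # 12/25
print("females as a percentage    :", float(subordinate * 100))  # 48.0
```

The same pair of absolute figures thus yields different derivatives depending on the base with which the comparison is made.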
Ratios. In the simplest possible form, a ratio is a quotient or the
numerical quantity obtained by dividing one figure by another. If
800 is divided by 100 the quotient is 8. Here 800 has been compared
with 100 which is the base in this case. In other words, 800 is to 100
as 8 is to 1. Or 800 : 100 : : 8 : 1. The process reduces the size of the
numbers and thus facilitates comparison. Instead of saying that the
RATIOS, PERCENTAGES AND LOGARITHMS 81
10% rise ∴ price becomes $95 + \frac{10 \times 100}{100} = 105$
20% fall ∴ price becomes $105 - \frac{20 \times 100}{100} = 85$
Thus according to this method the prices went up 10% over the
original price.
Using the second supposition :
Original price $= 100$
5% fall ∴ price becomes $\frac{95 \times 100}{100} = 95$
10% rise ∴ price becomes $\frac{110 \times 95}{100} = 104.5$
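The two suppositions differ only in the base on which each percentage is reckoned. The following sketch assumes, from the surviving figures, that the first supposition applies every percentage to the original price and the second applies it to the price then current; the function names are invented for illustration.

```python
def on_original_base(price, changes):
    """Apply each percentage change to the original price."""
    original = price
    for pct in changes:
        price += original * pct / 100
    return price


def on_current_base(price, changes):
    """Apply each percentage change to the price ruling at that moment."""
    for pct in changes:
        price = price * (100 + pct) / 100
    return price


changes = [-5, 10]  # a 5% fall followed by a 10% rise
print(on_original_base(100, changes))           # 105.0
print(round(on_current_base(100, changes), 2))  # 104.5
```

A statement of percentage change is therefore ambiguous unless the base is stated.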
the age of 45, and if during this period, on the basis of current fertility
rates, they give birth to 2412 female children, the female gross reproduction
rate would be 2.412. Reproduction rates are generally expressed in
terms of unity. It means that on the assumptions mentioned above for
each mother at the present moment there would be 2.412 mothers in
future.
Thus
$$\text{Female gross reproduction rate} = \frac{\text{Number of female children expected to be born to 1000 newly born females on the basis of current fertility without mortality}}{1000}$$
Net reproduction rate. As has been noted above the gross reproduction
rate does not take into account the factor of mortality. The
net reproduction rate takes into account this factor also. Female net
reproduction rate tells us about the number of female children expected
to be born to 1000 newly born females on the basis of current fertility
and mortality rates. It is quite obvious that net reproduction rate would
be less than the gross reproduction rate. 1000 newly born females would
in actual practice not remain 1000 at the age of say 16. Some of them
would die. Supposing their number is reduced from 1000 to 800 and
suppose further that the current fertility rate for the age of 16 is 20 per
1000 then the total number of children born to them would not be 20
but 16 only. If the sex ratio is 50 : 50 then only 8 female children would
be taken into account for the calculation of female net reproduction
rate. In the calculation of gross reproduction rate 10 female children
would have been taken into account. In each age group of women in the
child-bearing age period, the number would go on declining due to
mortality and the number of children born would also be reduced. If suppose
the total of 2,412 children (presumed in the calculation of gross reproduction
rate) comes down to 1411, female net reproduction rate would be
1.411. It shows that for every present mother there would be 1.411
future mothers or, in other words, the population is growing. If net
reproduction rate is just 1 it indicates a stationary population in future
and if it is less than 1 it is a sign of declining population.
Thus
$$\text{Female net reproduction rate} = \frac{\text{Number of female children expected to be born to 1000 newly born females on the basis of current fertility and mortality rates}}{1000}$$
In the same way male reproduction rates can be calculated by taking
into account the fathers and the number of male children expected to be
born. Combined reproduction rates for males and females can be
calculated by taking into account the population (both males and females)
and the number of children (both males and females) expected to be born.
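The arithmetic behind the two rates can be sketched as follows. The first call reproduces the example in the text (fertility of 20 per 1000 at age 16, 800 survivors out of 1000, sex ratio 50 : 50). The three-age `schedule` is hypothetical and far shorter than a real child-bearing period, and the function names are assumptions made for illustration.

```python
def female_births(fertility_per_1000, survivors_per_1000=1000, female_share=0.5):
    """Female children born in one year of age to the survivors of 1000 newly born females."""
    children = fertility_per_1000 * survivors_per_1000 / 1000
    return children * female_share


def reproduction_rate(schedule, with_mortality):
    """Sum the female births over all ages and express the result per female."""
    total = 0.0
    for fertility_per_1000, survivors_per_1000 in schedule:
        survivors = survivors_per_1000 if with_mortality else 1000
        total += female_births(fertility_per_1000, survivors)
    return total / 1000


print(female_births(20))        # 10.0 female children (gross, no mortality)
print(female_births(20, 800))   # 8.0 female children (net, 800 survivors)

# Hypothetical schedule: (fertility per 1000 women, survivors per 1000 births).
schedule = [(20, 800), (100, 780), (200, 760)]
gross = reproduction_rate(schedule, with_mortality=False)
net = reproduction_rate(schedule, with_mortality=True)
print(round(gross, 3), round(net, 3))  # 0.16 0.123
```

Because the survivors are never more than the original 1000, the net rate computed in this way cannot exceed the gross rate.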
LOGARITHMS
Example I
Example II
Divide .0009 by .008
(a) log .0009 = $\bar{4}.9542$
(b) log .008 = $\bar{3}.9031$
log a − log b = $\bar{1}.0511$
Antilog $\bar{1}.0511$ = .1125
∴ .0009 ÷ .008 = .1125
To raise a number to a power. In order to raise a number to a power
multiply the log. of the number by the exponent of the power and find
out the antilog.
Thus $a^n = \text{Antilog}\,(n \times \log a)$
Example I
Find out the value of $(95.2)^4$
log 95.2 = 1.9786
× 4
= 7.9144
Antilog 7.9144 = 82040000
∴ $(95.2)^4$ = 82040000
Example II
Find out the cube of .0991
log of .0991 = $\bar{2}.9961$
× 3
= $\bar{4}.9883$
Antilog of $\bar{4}.9883$ = .0009727
∴ $(.0991)^3$ = .0009727
Note. In the second example above 2 which is carried forward
from the mantissa to the characteristic is subtracted from the product
of 3 and $\bar{2}$ and thus the characteristic of the product is $\bar{4}$.
To extract the root of a number. To extract the root of a number
divide the log. of the number by the index of the root and find out the
antilog.
Thus
$\sqrt[n]{a} = \text{antilog}\left(\frac{\log a}{n}\right)$
Example I
Find out the value of $\sqrt[3]{92.4}$
log 92.4 = 1.9657
Divided by 3: 1.9657 ÷ 3 = .6552
Antilog .6552 = 4.519
∴ $\sqrt[3]{92.4}$ = 4.519
Example II
Find out the value of $\sqrt[7]{.00487}$
log .00487 = $\bar{3}.6875$
To divide $\bar{3}.6875$ by 7 we shall have to write it as $\bar{7}$ + 4.6875 because
in $\bar{3}.6875$ the characteristic is negative and the mantissa is positive and
division is not possible with the figures as they are.
So
($\bar{7}$ + 4.6875) ÷ 7 = $\bar{1}.6696$
Antilog $\bar{1}.6696$ = .4673
∴ $\sqrt[7]{.00487}$ = .4673
The utility of logarithms is very great in statistical calculations.
As has been said in the beginning, they help us in studying proportionate
changes. 10 to 100 is the same degree of relative change as 100
to 1000. In absolute figures these changes are different but if we find
out their logarithms, they would be 1 and 2 (for 10 and 100 respectively)
and 2 and 3 (for 100 and 1000 respectively), indicating that the relative
changes in the two cases are identical.
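The logarithmic rules used in this section can be checked with a short sketch. It is only an illustration: the function names are assumptions of this example, and the computer carries many more decimal places than a four-figure table, so its results agree with the table-based answers only up to the rounding inherent in such tables.

```python
import math

def log_divide(a, b):
    """Divide a by b: antilog of (log a - log b)."""
    return 10 ** (math.log10(a) - math.log10(b))

def log_power(a, n):
    """Raise a to the power n: antilog of (n x log a)."""
    return 10 ** (n * math.log10(a))

def log_root(a, n):
    """Extract the n-th root of a: antilog of (log a / n)."""
    return 10 ** (math.log10(a) / n)

quotient = log_divide(0.0009, 0.008)       # the division example
fourth_power = log_power(95.2, 4)          # (95.2)^4
cube = log_power(0.0991, 3)                # cube of .0991
cube_root = log_root(92.4, 3)              # cube root of 92.4
seventh_root = log_root(0.00487, 7)        # 7th root of .00487

# Equal proportionate changes show up as equal differences of logarithms.
step_10_to_100 = math.log10(100) - math.log10(10)
step_100_to_1000 = math.log10(1000) - math.log10(100)

print(round(quotient, 4), round(cube_root, 3), round(seventh_root, 4))
print(step_10_to_100, step_100_to_1000)
```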
Questions
1. Define a statistical derivative and discuss its utility in statistical analysis.
2. What is meant by co-ordinate and subordinate derivatives? Illustrate with
examples.
3. What precautions are necessary in the use of ratios and percentages?
4. What do you understand by a crude birth rate? Is it an accurate measurement
of the population growth of a locality? If not, how can it be modified to
give better results?
5. What is a "standard population"? How are birth rates and death rates
standardized?
6. What do you understand by general fertility rate? Is it an improvement
over standardized birth rates?
7. What statistical data are necessary for the calculation of net reproduction
rate? What is the deficiency in the existing Indian data in this respect?
(M.A. Alld., 1951).
8. What is net reproduction rate? Explain with the help of an example the
method of calculating it.
9. What are the various ways of the measurement of population growth? In
this connection discuss in detail the calculation of net reproduction rate.
(M. Com. Allahabad, 1952).
10. Point out the ambiguity or mistake, if any, in the following statements:
(a) 99% of the people who drink die before reaching 100 years of age.
Therefore, drinking is bad for longevity.
(b) The rate of increase in the number of cows in India is greater than that of
the population. Therefore, the people of India are now getting more milk per head.
(M. A. Patna, 1943).
RATIOS, PERCENTAGES AND LOGARITHMS 95
11. Below is given the fertility rate per 1000 women, by age group, for a
certain country for 1936:

Age group    Fertility rate per    Age group    Fertility rate per
(years)      1000 women            (years)      1000 women
16-20             19               36-40            157
21-25            173               41-45             67
26-30        [illegible]           46-50              9
31-35            201

Assuming that the ratio of female babies to total births for the country and year
concerned is 48.8%, calculate the gross reproduction rate for the country and explain
what this rate means.
12. The following are the death rates, per thousand, per annum, of two towns
in a certain year:

                       Town A                            Town B
Ages         Population  Deaths  Death rate    Population  Deaths  Death rate
(years)                          per 1000                          per 1000
Under 2          3000      192     64.0            5000      300     60.0
2-10            10000       70      7.0           12000       78      6.5
10-20           10000       40      4.0           10000       38      3.8
20-60           32500      260      8.0           25000      190      7.6
60 & over        8500      510     60.0            8000      460     57.5
Total           64000     1072     16.75          60000     1066     17.77
(a) For each age group the death rate of town A is greater than that of
town B, but the reverse is the case when all age groups are grouped together. Why
is it so?
(b) Calculate the standardized death rate for town B taking the population
of town A as the standard. (B. Com. Andhra, 1944), (M. A. Punjab, 1954).
13. Compute crude and standardized death rates in the following and find out
if the local population has a higher or lower death rate:
I
80
Above 6) 8000 320 8000 400
14. What are absolute and relative measurements? Explain in this connection
the use of ratios, percentages and coefficients. (B. Com., 1941).
15. Write short notes on: (a) Derivative series. (b) Complex derivatives.
(c) Total fertility rate. (d) Male reproduction rate. (e) Fallacies in the use of
ratios and percentages.
              A                                   B
No. of candidates   Successful      No. of candidates   Successful
appeared                            appeared
M. Sc.
M. A.
60 ,0
90
zoo
Z40
160
190
160
_____ :'0:. __
ToU! 800 '90 800 '90
(II y,.,.. T. D. C•• R4.. 1961).
17. The following table gives the results of certain examinations of three
universities in the year 19'7. Which is the best university?
M.A.
r= Percentages resultlln the otliveftlty
A B C
 ... 
7
,  0
M.Sc:.
I
70
B.A. 80
B.Bc. 70
B.Com. 60
(M. A. c.Jmtlti)
9
Measures of Central Tendency
Need and meaning
We have discussed in the last chapter the utility of various statistical
derivatives like ratios, percentages and rates, etc., in reducing the quantum
of data and also in reducing the size of the figures. But these derivatives
are not enough for the proper condensation of figures and sometimes
there are many fallacies in their use. Condensation of data is necessary
in statistical analysis because a large number of big figures are not only
confusing to the mind but difficult to analyse also. In order to reduce the
complexity of data and to make them comparable it is essential that the various
phenomena which are being compared are reduced to one figure each.
If, for example, a comparison is made between the marks obtained by a
group of 200 students belonging to a university and the marks obtained
by another group of 200 students belonging to another university, it
would be impossible to arrive at any conclusion if the two series relating
to these marks are directly compared. On the other hand, if each of these
series is represented by one figure, comparison would be an extremely
easy affair. It is obvious that a figure which is used to represent a whole
series should neither have the lowest value in the series nor the highest
value, but a value somewhere between these two limits, possibly in the
centre, where most of the items of the series cluster. Such figures are
called Measures of Central Tendency or Averages. An average represents
a whole series and as such, its value always lies between the minimum
and maximum values and generally it is located in the centre or middle
of the distribution.
Objects. Measures of central tendency or averages give a bird's eye
view of the huge mass of statistical data which ordinarily are not easily intelligible.
They are devices to aid the human mind in grasping the true significance
of large aggregates of facts and measurements. They set aside the unnecessary
details of the data and put forward a concise picture of the complex
phenomena under investigation. If the human mind were capable
of grasping all the details of large numbers and their interrelationships,
averages would have no utility. But the human mind is not capable of
this. It is impossible to keep in mind, say, the details of heights, weights,
incomes and expenditures of even 200 students, let alone bigger groups.
This difficulty of keeping all the details in mind necessitates the use of
averages not only for grasping the central theme of the data, but also for the
facility of comparison and further analysis. Averages are thus extremely
helpful for purposes of comparison.
Why is an average a representative. The reason why an average is
a valid representative of a series lies in the fact that ordinarily most of the
items of a series cluster in the middle. On the extreme ends the number
of items is very small. In a population of 10,000 adults there would
hardly be any person who is 2 ft. high or whose height is above 8 ft.
There will be a small range within which these values would vary,
say 5 ft. to 6' 5". Even within this range a large number of persons
would have a height between, say, 5' 5" and 5' 10". In other class intervals
of height the number of persons would be comparatively small. Under
such circumstances if we conclude that the height of this particular
group of persons would be represented by, say, 5' 7", we can reasonably
be sure that this figure would, for all practical purposes, give us a
satisfactory conclusion. This average would satisfactorily represent
the whole group of figures from which it has been calculated. Ordinarily,
items with values less than the average cancel the items whose values
are more than the average. Thus the average of 3, 4 and 5 is 4. The
item before it is one less in value and the item after it is one more in
value than the average figure of 4. Thus the two deviations of -1
and +1 cancel each other.
Typical and descriptive averages. It should, however, be noted
that a series can be represented by an average only if the average is
really typical. Sometimes the average which is calculated is not truly
representative of the series. In such cases it should not be used to
represent the series. Averages which are representative are called
Typical Averages and those which are not representative and have only
a theoretical value are called Descriptive Averages.
Characteristics of a representative average. In whatever way we define
an average it is necessary to keep in mind the fact that an average is
a particular value in a variable and as such it has to be expressed in the
same unit in which the series is. If the variable refers to the weights
of students in pounds the average would also be a weight and in pounds.
Similarly the average of ratios and percentages should be in ratios and
percentages only. Averages are meant for condensing a frequency
distribution into one figure and it is necessary that they are in the same
unit in which the original series is. At this stage, it is necessary to decide
about the desiderata or the requirements for a good measure of central
tendency. A typical average should possess the following characteristics:
(a) It should be rigidly defined. If an average is left to the estimation
of an observer and if it is not a definite and fixed value it cannot be
representative of a series. The bias of the investigator in such cases
would considerably affect the value of the average. If the average is
rigidly defined this instability in its value would be no more, and it
would always be a definite figure.
(b) It should be based on all the observations of the series. If some
of the items of the series are not taken into account in its calculation
the average cannot be said to be a representative one. As we shall
see later on there are some averages which do not take into account
all the values of a group and to this extent they are not satisfactory
averages.
(c) It should be capable of further algebraic treatment. If an average
does not possess this quality, its use is bound to be very limited. It
will not be possible to calculate, say, the combined average of two or
more series from their individual averages; further it will not be possible
to study the average relationships of various parts of a variable, if it is
expressed as the sum of two or more variables. Many other similar
studies would not be possible if the average is not capable of further
algebraic treatment.
(d) It should be easy to calculate and simple to follow. If the calculation
of the average involves tedious mathematical processes it will
not be readily understood and its use will be confined only to a limited
number of persons. It can never be a popular average. As such,
one of the qualities of a good average is that it should not be too abstract
or mathematical and there should be no difficulty in its calculation.
Further, the properties of the average should be such that they can be
easily understood by persons of ordinary intelligence.
(e) It should not be affected by fluctuations of sampling. If two
independent sample studies are made in any particular field, the averages
thus obtained should not materially differ from each other. No doubt,
when two separate enquiries are made, there is bound to be a difference
in the average values calculated, but in some cases this difference would
be great while in others comparatively less. Those averages in which
this difference, which is technically called "fluctuation of sampling",
is less, are considered better than those in which this difference is
more.
One more thing to be remembered about averages is that the items
whose average is being calculated should form a homogeneous group. It is absurd
to talk about the average of a man's height and his weight. If the data
from which an average is being calculated are not homogeneous, misleading
conclusions are likely to be drawn. If, in finding out the average
production of cotton cloth per mill, big and small mills are not separated,
the average would be unrepresentative. Similarly, to study the wage
level in the cotton-mill industry of India, separate averages should be
calculated for the male and female workers. Again, adult workers should
be studied separately from the juvenile group. Thus we see that, as far
as possible, the data from which an average is calculated should be a
homogeneous lot. Homogeneity can be achieved either by selecting
only like items or by dividing the heterogeneous data into a number
of homogeneous groups.
Measures of various orders
Statistical series may differ from each other in the following three
ways : 
1. They may differ in the values of the variable round which
most of the items cluster.
ARITHMETIC AVERAGE
$a = \frac{1}{n}(m_1 + m_2 + m_3 + \dots + m_n)$
or $a = \frac{\sum m}{n}$
Where
a = Arithmetic average; m = Values of the variable; $\sum$ = Summation
or total; n = Number of items.
The following example would illustrate this formula.
Example 1. Calculate the simple arithmetic average of the
following items:
Size of items
20 50 72
28 53 74
34 54 75
39 59 78
42 64 79
Solution. Direct Method
Computation of arithmetic average
Size of items
(m)
20
28
34
39
42
50
53
54
59
64
72
74
75
78
79
n = 15, $\sum m$ = 821
$a = \frac{821}{15}$ = 54.73
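As a cross-check on the direct method, the following minimal sketch totals the fifteen items of Example 1 and divides by their number. The function name is an assumption of this illustration, not the author's notation.

```python
items = [20, 28, 34, 39, 42, 50, 53, 54, 59, 64, 72, 74, 75, 78, 79]

def arithmetic_average(values):
    """Direct method: total of the items divided by the number of items."""
    return sum(values) / len(values)

total = sum(items)                  # sum of m
n = len(items)                      # number of items
average = arithmetic_average(items)

print(total, n, round(average, 2))  # 821 15 54.73
```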
Proof. Suppose $m_1, m_2, m_3$, etc., stand for the values of a
variable and $d_1, d_2, d_3$, etc., for their respective deviations from the
mean, and a stands for their arithmetic average and n for the number
of items.
Then,
$a = \frac{m_1 + m_2 + m_3 + \dots + m_n}{n}$
or $m_1 + m_2 + m_3 + \dots + m_n = an$
The number of items is equal to n.
∴ If we subtract a, n times, from each side of the equation we
get
$(m_1 - a) + (m_2 - a) + (m_3 - a) + \dots + (m_n - a) = 0$
But
$(m_1 - a) = d_1$, $(m_2 - a) = d_2$, $(m_3 - a) = d_3$ and so on.
∴ $d_1 + d_2 + d_3 + \dots + d_n = 0$
Or $\sum d = 0$
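The result of this proof can be verified numerically. The sketch below reuses the fifteen items of Example 1 and works in exact fractions, so that rounding of the mean does not disturb the total; the helper name is an assumption of this illustration.

```python
from fractions import Fraction

def deviations_from_mean(values):
    """Return each item's deviation d = m - a from the arithmetic average a."""
    exact = [Fraction(v) for v in values]
    a = sum(exact) / len(exact)
    return [m - a for m in exact]

items = [20, 28, 34, 39, 42, 50, 53, 54, 59, 64, 72, 74, 75, 78, 79]
deviations = deviations_from_mean(items)
total_deviation = sum(deviations)

print(total_deviation)  # 0
```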
Symbolically:
$a = x + \frac{\sum dx}{n}$
Where
a = Actual arithmetic average; x = Assumed arithmetic average;
$\sum dx$ = The sum of the deviations from the assumed mean; n = Number
of items.
It should be remembered that the difference between the actual
arithmetic average and the assumed arithmetic average is equal to the
sum of the deviations from the assumed arithmetic average divided by
the number of items.
Symbolically:
$a - x = \frac{\sum dx}{n}$
If we solve Example 1 by this short-cut method it will give us
exactly the same answer as we got by the direct method. This alternative
method is illustrated below:
Calculation of arithmetic average
Short-cut method

Size of items    Deviation from an assumed mean (50)
(m)              (dx)
20               -30
28               -22
34               -16
39               -11
42                -8
50                 0
53                +3
54                +4
59                +9
64               +14
72               +22
74               +24
75               +25
78               +28
79               +29
n = 15           $\sum dx$ = +71

$a = 50 + \frac{71}{15}$ = 50 + 4.73 = 54.73
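The short-cut method can be sketched in a few lines. The function name is an assumption of this illustration; the point it shows is that any assumed mean leads to the same actual average.

```python
def short_cut_average(values, assumed_mean):
    """Short-cut method: a = x + (sum of deviations from x) / n."""
    deviations = [m - assumed_mean for m in values]
    return assumed_mean + sum(deviations) / len(values)

items = [20, 28, 34, 39, 42, 50, 53, 54, 59, 64, 72, 74, 75, 78, 79]

sum_dx = sum(m - 50 for m in items)           # +71 with assumed mean 50
average = short_cut_average(items, 50)        # 50 + 71/15
same_average = short_cut_average(items, 60)   # a different assumed mean

print(sum_dx, round(average, 2), round(same_average, 2))
```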
Or $a = \frac{\sum mf}{n} = \frac{\sum mf}{\sum f}$
The following illustration would clarify the formula:
Example 2. The following table gives the number of children
born per family in 735 families. Calculate the average number of
children born per family.
$a = x + \left(\frac{\sum f\,dx}{n} \times i\right)$
The following illustrations would clarify these rules:
Example 3. The following data relate to sizes of shoes sold at
a store during a given week. Find the average size by the short-cut
method.
Computation of the average size of shoes

Size of shoes   No. of pairs   Size of shoes   No. of pairs
4.5                  1          8                  95
5                    2          8.5                82
5.5                  4          9                  75
6                    5          9.5                44
6.5                 15         10                  25
7                   30         10.5                15
7.5                 60         11                   4
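For the shoe data of Example 3, a weighted average can be sketched both directly and with step deviations. The assumed mean of 8 and the step of 0.5 are choices made for this illustration only, and the function name is likewise an assumption.

```python
sizes = [4.5, 5, 5.5, 6, 6.5, 7, 7.5, 8, 8.5, 9, 9.5, 10, 10.5, 11]
pairs = [1, 2, 4, 5, 15, 30, 60, 95, 82, 75, 44, 25, 15, 4]

def step_deviation_average(values, freqs, assumed_mean, step):
    """a = x + (sum of f * dx / n) * i, where dx is measured in steps of i."""
    n = sum(freqs)
    total_step_dev = sum(f * (m - assumed_mean) / step
                         for m, f in zip(values, freqs))
    return assumed_mean + total_step_dev / n * step

n = sum(pairs)                                             # total frequency
direct = sum(m * f for m, f in zip(sizes, pairs)) / n      # sum of mf / sum of f
short_cut = step_deviation_average(sizes, pairs, assumed_mean=8, step=0.5)

print(n, round(direct, 2), round(short_cut, 2))
```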
Solution. Short-cut Method No. 1.

Height | No. of persons | Deviations from the assumed mean (67) | Step deviation | Total deviations
(iv) The fourth characteristic laid down for an ideal average, that
it should be easy to calculate and simple to follow, is also found in
arithmetic average. The calculation of the arithmetic average is simple
and it is very easily understandable. It does not require the arraying
of data which is necessary in case of some other averages. In fact this
average is so well known that to a common man "average" means an
arithmetic average.
Thus the arithmetic average
(a) is simple to calculate,
(b) does not need arraying of data,
(c) is easy to understand.
(v) The last characteristic of an ideal average, that it should be
least affected by fluctuations of sampling, is also present in arithmetic
average to a certain extent. If the number of items in a series is large,
the arithmetic average provides a good basis of comparison, as in such
cases the abnormalities in one direction are set off against the abnormalities
in another direction.
Drawbacks of arithmetic average
Although the arithmetic average satisfies most of the conditions
of an ideal average, there are certain drawbacks also from which it suffers
and as such it should be used with caution. These drawbacks really
arise on account of the peculiar nature of this average and the technique
of its calculation. The points worth consideration in this respect are
as follows:
(i) Since arithmetic average is calculated from all the items of a
series, sometimes the abnormal items may considerably affect this average,
particularly when the number of items is not large. For example,
if the income of a shopkeeper is Rs. 1,000 per month and the incomes of
his three assistants are Rs. 25, Rs. 35 and Rs. 40 per month respectively,
the average income of this group would be Rs. $\frac{1000 + 25 + 35 + 40}{4}$
or Rs. 275 per month. This is not at all a representative figure. Similarly,
if one player in cricket scores 300 runs and the remaining 10 players
score only 140 runs, the total is 440 runs and the average per player is
40 runs. It is not a representative figure as 10 players out of 11 have
scored on an average only 14 runs each.
(ii) Further, the fact that the arithmetic average cannot be calculated
without all the items of a series can also be said to be a drawback.
If out of 1000 items the values of 999 items are known the arithmetic
average cannot be calculated. Other averages like median and mode do
not need complete data.
(iii) Arithmetic average is no doubt easy to calculate but in a
relative sense its calculation may be more difficult than that of mode or
median as they can be located merely by inspection.
(iv) Another point to be noted in this connection is that the
arithmetic average can be a figure which does not exist in the series
at all. The arithmetic average of 12, 14 and 19 is 15. No item of the
series has a value of 15.
(v) Arithmetic average sometimes gives results which appear
almost absurd. If we have to find out the number of children per
family, and if we use the arithmetic average, it is quite likely that we get
the average as 3.4 children. Obviously the result is absurd. A child
cannot be divided in fractions.
(vi) Sometimes arithmetic average gives fallacious conclusions.
Suppose the incomes of two groups of persons are as follows:

            A        B
         1000      325
          100      300
           75      285
           25      290
Total    1200     1200

The average income of each of these two groups is Rs. 300. It would
appear from the averages that both the groups are economically
at the same level, and the two series are almost similar to each other,
but this is not the case. The two series entirely differ from each
other so far as their composition is concerned.
(vii) The arithmetic average gives greater importance to bigger items
of a series and lesser importance to smaller items. It has
an upward bias. One big item among four
items, three of which are small, will push up the average considerably.
But the reverse is not true. If in a series of four items there are three
big items and one small item the average will not be pulled down very
much.
The above discussion thus leads us to the conclusion that though
arithmetic average fulfils most of the conditions of an ideal average yet
it should be used with caution as it is likely to give erroneous conclusions
under certain conditions.
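Two of the drawbacks listed above are easy to reproduce with the figures given in the text. The sketch below is only an illustration; it uses the range (highest minus lowest item) as a simple, assumed way of showing that two series with the same average can differ in composition.

```python
from statistics import mean

# Drawback (i): one abnormal item dominates the average.
shop_incomes = [1000, 25, 35, 40]
shop_average = mean(shop_incomes)                # Rs. 275
assistants_average = mean(shop_incomes[1:])      # the three assistants alone

# Drawback (vi): equal averages, very different composition.
group_a = [1000, 100, 75, 25]
group_b = [325, 300, 285, 290]
average_a, average_b = mean(group_a), mean(group_b)
range_a = max(group_a) - min(group_a)
range_b = max(group_b) - min(group_b)

print(shop_average, round(assistants_average, 2))
print(average_a, average_b, range_a, range_b)
```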
MEDIAN
Median is the value of the middle item of a series when it is arrayed in
ascending or descending order of magnitude.
It divides the series in two equal parts.
The values of items in one part are less than the value of the median and
in the other part are more than it. If in a class there are 21 students and
if they stand in a line in accordance with their height, beginning with the
shortest amongst them and ending with the tallest, then the 11th student
would be in the centre and would divide them in two parts consisting of
ten students each. Students of one part will have heights less than the
height of the 11th student and of the other part more than this height.
The height of the 11th student is the median height. For ungrouped
data it may be convenient to find the value of the median by counting
$\frac{N+1}{2}$ items, beginning with the highest (or lowest) item in the
array. In grouped data this counting is abandoned.
Symbolically M = size of $\frac{n+1}{2}$th item
where M stands for the median and n for the number of items.
the 5.5th item. In such a case the values of the 5th and 6th items would be
added and the total would be divided by 2; the resulting figure would
be the value of the median. The following example would clarify this
point:
Example 8. The following table gives the marks obtained by a
batch of 30 B. Com. students in a class-test in statistics. (Marks 100).
Roll No.  Marks obtained   Roll No.  Marks obtained
 1           33             16          24
 2           32             17          33
 3           55             18          42
 4           47             19          38
 5           21             20          45
 6           50             21          26
 7           27             22          33
 8           12             23          44
 9           68             24          48
10           49             25          52
11           40             26          30
12           17             27          58
13           44             28          37
14           48             29          38
15           62             30          35
Find the value of the median.
Solution. Marks obtained by 30 students arranged in ascending
order of magnitude:
Serial No.  Marks   Serial No.  Marks   Serial No.  Marks
 1           12      11          33      21          47
 2           17      12          35      22          48
 3           21      13          37      23          48
 4           24      14          38      24          49
 5           26      15          38      25          50
 6           27      16          40      26          52
 7           30      17          42      27          55
 8           32      18          44      28          58
 9           33      19          44      29          62
10           33      20          45      30          68
M = size of $\frac{n+1}{2}$th item
= size of the 15.5th item = $\frac{38 + 40}{2}$ = 39 marks
Median = size of $\frac{n+1}{2}$th pair, where n equals the total frequency
= size of $\frac{457 + 1}{2}$ or 229th pair = 8.5
It will be clear from the above figures that the value of items from the
213th to the 294th is 8.5. The value of the 229th item, thus, is also 8.5.
Determination of median in a continuous series
When the median of a continuous frequency distribution has to
be determined there is one difficulty. The value of the median lies in
a class interval, and to get a definite figure, interpolation has to be done.
Suppose, for example, it is found that the value of the median lies in the
20 to 30 class interval whose frequency is 40. Now to find out the value
of the median we have to take recourse to interpolation and to apply a
particular formula. This formula, which we discuss below, is based on
the assumption that the frequencies of the class in which the median lies
are uniformly spread over the whole class-interval. In the above case,
we shall presume that these 40 units are equally distributed in the whole
class interval of 20 to 30, or that each of these ten values 20, 21, 22 and so on
has a frequency of 4 units.
The formula of interpolation to find out the median is:
$M = l_1 + \frac{l_2 - l_1}{f_1}(m - c)$
where $l_1$ and $l_2$ are the lower and upper limits of the class in which the
median lies, $f_1$ the frequency of this class, m the number of the middle
item and c the cumulative frequency of the class preceding the median class.
at the middle item cutting the curve at a particular point. The value of
the median is read on the vertical line (called ordinate) at the point of
intersection. This procedure would be illustrated in the chapter on
Graphs.
Merits of median
(i) It satisfies the first condition laid down in previous pages for
an ideal average as it is rigidly defined.
(ii) It can be easily calculated and it is understood without any
difficulty.
(iii) It is not affected by the values of the extreme items and as
such is sometimes more representative than arithmetic average. If the
incomes of five persons are Rs. 30, Rs. 35, Rs. 40, Rs. 45 and Rs. 1,000
the median would be Rs. 40 whereas the arithmetic average would be
Rs. 230. Median in such cases is a better average.
(iv) Even if the value of the extremes is not known median can
be calculated if the number of items is known.
(v) It can be located merely by inspection in many cases.
(vi) It gives best results in a study of those phenomena which are
incapable of direct quantitative measurement, for example intelligence.
It is impossible to measure intelligence quantitatively but it is possible to
arrange a group of persons in ascending or descending order of intelligence
and thus to locate a person whose intelligence can be said to be average.
Drawbacks of median
(i) Median may not be representative of a series in many cases.
This is specially so when there are wide variations between the values
of different items. For example, if the marks obtained by eleven students
are respectively 15, 16, 16, 18, 18, 20, 54, 60, 60, 60 and 72 the median
marks would be 20. Clearly the average is not representative of the series.
(ii) It is not suitable for further algebraic treatment. For example,
we cannot find out the total value of the items if we know their
number and median.
(iii) When median has to be calculated in continuous series it
requires interpolation. The assumption of the interpolation, that all
the frequencies of the class-interval are uniformly spread over their
values in the class-interval, may not be actually true. In most cases it will
not be true.
(iv) If big or small items in a series are to receive greater importance
median would be an unsuitable average. Median ignores the
values of extreme items.
(v) Median is more likely to be affected by the fluctuations of sampling
than the arithmetic average.
(vi) The arrangement of items in ascending or descending order
is sometimes very tedious.
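The two numerical illustrations used in the merits and drawbacks above can be reproduced with the standard `statistics` module. This is a small check of the figures quoted in the text, not an additional method.

```python
from statistics import mean, median

# Merit (iii): the extreme income of Rs. 1,000 does not disturb the median.
incomes = [30, 35, 40, 45, 1000]
income_median = median(incomes)
income_mean = mean(incomes)

# Drawback (i): with widely varying marks the median may be unrepresentative.
marks = [15, 16, 16, 18, 18, 20, 54, 60, 60, 60, 72]
marks_median = median(marks)
marks_mean = mean(marks)

print(income_median, income_mean)
print(marks_median, round(marks_mean, 2))
```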
Comparison of mean and median
Both the mean and the median satisfy the conditions of rigid
definition and stability, but so far as ease in calculation is concerned
median has a distinct advantage over mean. On the other hand, the
general fluctuations of sampling affect the median to a greater extent
than the mean, though there might be some cases where mean is affected
to a greater extent by such fluctuations than the median.
So far as the algebraic treatment of these two averages is
concerned, mean is definitely superior to median. In case of mean,
when several series relating to one phenomenon are combined into one,
it is possible to find out the combined average from the averages of the
various series and their number of observations. It is not possible in
case of median. However, if the component series are symmetrical¹
their means and medians would be identical and as such the combined mean
and median would also be the same. But in case of asymmetrical distributions
the combined median would not coincide with the mean nor with
any other assignable value. The sum or difference of the corresponding
values of the items of two series is not equal to the sum or difference of
their medians, as is the case with arithmetic average. The calculated value
of the median, subject to error, is not necessarily the same as the true value
of the median, even if the error is zero, that is, if positive and negative
errors cancel each other.
On the other hand, median has certain advantages over the mean.
It is easily calculated and is readily obtained without even knowing
the value of all the items, provided they can be arrayed. Further, in
some cases mean cannot be calculated due to the extreme class intervals
being open, like "less than 100" or "more than 10,000" etc., but median
can be easily obtained in such distributions. Sometimes median may be
more representative than the arithmetic average, due to the fact that it is
not affected by the values of extreme items. If, for example, the values
of most of the items of a sample cluster round 200, median would not be
affected if suddenly one item, whose value is 3000, is included in the
sample. Mean in such cases is more affected by fluctuations of sampling
than the median. Further, median is generally the value of a particular
item of the series, whereas mean may not be the value of any item of the
series. In this sense median is a more natural average than the mean.
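The algebraic advantage of the mean described above can be illustrated with a short sketch. The two series are hypothetical figures invented for this example, and the function name is likewise an assumption; the combined mean follows from the component means and their numbers of observations, whereas the combined median cannot be obtained from the component medians alone.

```python
from statistics import mean, median

def combined_mean(means, counts):
    """Combined average from component averages and their numbers of items."""
    return sum(a * n for a, n in zip(means, counts)) / sum(counts)

# Hypothetical series, used only for illustration.
series_1 = [10, 20, 30]
series_2 = [40, 50, 60, 70, 200]
pooled = series_1 + series_2

from_parts = combined_mean([mean(series_1), mean(series_2)],
                           [len(series_1), len(series_2)])
pooled_mean = mean(pooled)

component_medians = (median(series_1), median(series_2))
pooled_median = median(pooled)

print(from_parts, pooled_mean)
print(component_medians, pooled_median)
```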
QUARTILES, DECILES AND PERCENTILES
It has been seen that the median divides an arrayed series in two
equal parts. The values of items in one part are more than the median
value, and the values of items in the other part less than the value of the
median. With a view to have a better study of the composition of
a series it may be necessary to divide it in four, five, six, seven, eight,
nine, ten or hundred parts. Usually the series are divided either in
four, ten or hundred parts. Just as one item divides the series in two
parts, three items would divide it in four parts, nine items in ten parts and
ninety-nine items in hundred parts. The values of these items are respectively
known as Quartiles, Deciles and Percentiles. A series can be
divided in five, seven or eight parts by Quintiles, Septiles and Octiles.
¹ For further explanation see chapters on Dispersion and Skewness.
There are thus three quartiles, nine deciles and ninety-nine percentiles
in a series. The second quartile, fifth decile and 50th percentile is the
median. The value of the item which divides the first half of a series
(with values less than the median) in two equal parts is called the First Quartile
or Lower Quartile, and the value of the item which divides the latter
half of a series (with values more than the median) in two equal parts is
called the Third Quartile or Upper Quartile. The Second Quartile or the
Middle Quartile is the same thing as the median.
The calculation of Quartiles, Deciles, Percentiles and other such
values is done by following the same rules with which the value of median
is determined.
Thus
$Q_1$ = the value of the $\frac{n}{4}$th item
Solution. 1st Quartile or $Q_1$ = size of $\frac{n}{4}$th pair
= size of $\frac{457}{4}$ or 114.25th pair
= 7.5
Upper Quartile = size of $\frac{3(n)}{4}$th pair
= size of $\frac{3(457)}{4}$ or 342.75th pair
= 9
7th Decile = size of $\frac{7(n)}{10}$th pair
= size of $\frac{7(457)}{10}$ or 319.9th pair
= 9
46th Percentile = size of $\frac{46(n)}{100}$th pair
= size of $\frac{46(457)}{100}$ or 210.22th pair
= 8
3rd Quintile = size of $\frac{3(n)}{5}$th pair
= size of $\frac{3(457)}{5}$ or 274.2th pair
= 8.5
5th Octile = size of $\frac{5(n)}{8}$th pair
= size of $\frac{5(457)}{8}$ or 285.6th pair
= 8.5
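The partition values worked out above can all be obtained from one small routine. The sketch assumes the shoe data of Example 3 (total frequency 457) and follows the rule used in the solution, namely locating the k(n)/q-th pair in the cumulative frequencies; the function name is an assumption of this illustration.

```python
def partition_value(sizes, freqs, k, parts):
    """Size of the k*n/parts-th item in a discrete frequency series."""
    position = k * sum(freqs) / parts
    cumulative = 0
    for size, f in zip(sizes, freqs):
        cumulative += f
        if cumulative >= position:
            return size

sizes = [4.5, 5, 5.5, 6, 6.5, 7, 7.5, 8, 8.5, 9, 9.5, 10, 10.5, 11]
pairs = [1, 2, 4, 5, 15, 30, 60, 95, 82, 75, 44, 25, 15, 4]

lower_quartile = partition_value(sizes, pairs, 1, 4)
upper_quartile = partition_value(sizes, pairs, 3, 4)
seventh_decile = partition_value(sizes, pairs, 7, 10)
percentile_46 = partition_value(sizes, pairs, 46, 100)
third_quintile = partition_value(sizes, pairs, 3, 5)
fifth_octile = partition_value(sizes, pairs, 5, 8)

print(lower_quartile, upper_quartile, seventh_decile,
      percentile_46, third_quintile, fifth_octile)
```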
$Q_1 = l_1 + \frac{l_2 - l_1}{f_1}(q_1 - c)$
Where $l_1$ and $l_2$ are the lower and upper limits of the class in which
the first quartile lies, $f_1$ the frequency of this class, $q_1$ the quartile number
$\frac{n}{4}$ and c the cumulative frequency of the class preceding the quartile
class.
$Q_3 = l_1 + \frac{l_2 - l_1}{f_1}(q_3 - c)$
Where $l_1$ and $l_2$ stand for the lower and upper limits of the class
in which the 3rd quartile lies, $f_1$ for the frequency of this class, $q_3$ for the
quartile number and c for the cumulative frequency of the class preceding
the quartile class.
Similarly the formulae can be derived for the calculation of deciles,
percentiles, etc.
Thus
$D_2 = l_1 + \frac{l_2 - l_1}{f_1}(d_2 - c)$ and
$P_{72} = l_1 + \frac{l_2 - l_1}{f_1}(p_{72} - c)$
Example 13. From the data given below calculate the median and
quartiles.
Solution. Calculation of the median and quartile ages of married females.

Age       Number of married   Cumulative
          females             frequency
0-5              3                  3
5-10            31                 34
10-15          410                444
15-20         1809               2253
20-25         2446               4699
25-30         2223               6922
30-35         1723               8645
35-40         1292               9937
40-45          963              10900
45-50          762              11662
50-55          531              12193
55-60          317              12510
60-65          156              12666
65-70           59              12725
70-75           37              12762
Total       12,762
The median age of married fe~ales
th f th n females, where n equals the total
= e age a e 2 frequency
12762
... the age of the 2 _. i.e., 6381st married female .
l~ PUoNDAMBNTALS OP STATIS'l'ICS
who lies in the 2530 age group. Applying the formula of interpolation
1,/1 )
M I 1+,;(1111
where, M represents the median,/:t!.._nd I. the lower and the upper limits
of the group in which median is situated;!1 the frequency of median class;
111. the number of middle item or T items a"nd I. the cumulative
frequency of the group lower than the one ~ which median is situated.
3025
M25+ 2223 (6381  4699) 28.8· years approx.
The lower quartile age of married females
= the age of the $\frac{n}{4}$th, i.e., 3190.50th married female who lies
in the 20-25 age group.
By interpolation
$$Q_1 = l_1 + \frac{l_2 - l_1}{f_1}(q_1 - c)$$
where $Q_1$ represents lower quartile; $l_1$ and $l_2$ the lower and the upper
limits of the group in which lower quartile is situated; $f_1$ the frequency of
lower quartile class; $q_1$ the number of $\frac{n}{4}$ items; and $c$ the
cumulative frequency of the group lower than the one in which the lower
quartile is situated.
$$Q_1 = 20 + \frac{25 - 20}{2446}(3190.50 - 2253) = 21.9 \text{ yrs. approx.}$$
The upper quartile age of married females
= the age of the $3\left(\frac{n}{4}\right)$th, i.e., 9571.5th married female who
is situated in the 35-40 age group.
By interpolation,
$$Q_3 = l_1 + \frac{l_2 - l_1}{f_1}(q_3 - c)$$
where $Q_3$ stands for upper quartile; $l_1$ and $l_2$ for the lower and the upper
limits of the group in which upper quartile is situated; $f_1$ for the frequency
of upper quartile class; $q_3$ for the number of $3\left(\frac{n}{4}\right)$ items; and $c$
for the cumulative frequency of the group lower than the one in which
$Q_3$ is situated.
$$Q_3 = 35 + \frac{40 - 35}{1292}(9571.5 - 8645) = 38.6 \text{ yrs. approx.}$$
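The interpolation used for the median and the two quartiles follows one pattern, so it can be sketched as a single routine. The Python below is only an illustration applied to the figures of Example 13; the function name `grouped_quantile` and its arguments are assumptions made for this sketch and do not come from the text.

```python
def grouped_quantile(lower_limits, width, freqs, fraction):
    """Interpolate the value below which `fraction` of the items lie.

    Implements l1 + (l2 - l1) / f1 * (q - c) for equal class widths,
    where q = fraction * n and c is the cumulative frequency of the
    classes preceding the one that contains the q-th item.
    """
    target = fraction * sum(freqs)      # q: n/2 for the median, n/4 for Q1, ...
    cumulative = 0                      # c
    for lower, f in zip(lower_limits, freqs):
        if f > 0 and cumulative + f >= target:
            return lower + width / f * (target - cumulative)
        cumulative += f
    raise ValueError("fraction must lie between 0 and 1")


# Age classes 0-5, 5-10, ..., 70-75 and their frequencies (Example 13).
lowers = list(range(0, 75, 5))
freqs = [3, 31, 410, 1809, 2446, 2223, 1723, 1292,
         963, 762, 531, 317, 156, 59, 37]

median = grouped_quantile(lowers, 5, freqs, 1 / 2)
q1 = grouped_quantile(lowers, 5, freqs, 1 / 4)
q3 = grouped_quantile(lowers, 5, freqs, 3 / 4)
print(round(median, 1), round(q1, 1), round(q3, 1))
```

Deciles and percentiles can be obtained from the same routine by passing, for instance, 8/10 or 56/100 as the fraction.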
Example 14. From the data given in Example 10 calculate
(a) 8th decile and (b) 56th percentile.
Solution: (a) 8th decile
$D_8$ = size of $8\left(\frac{n}{10}\right)$ items, where n equals 245
= size of 196th item, which lies in 7-9 group; applying the
formula of interpolation,
$$D_8 = l_1 + \frac{l_2 - l_1}{f_1}(d_8 - c)$$
where $l_1$ and $l_2$ represent the lower and the upper limits of the group
in which 8th decile is situated; $f_1$, the frequency of the same group; $d_8$, the
value of $8\left(\frac{n}{10}\right)$ item and $c$, the cumulative frequency of the group
lower than the one in which 8th decile is situated.
We get
$$D_8 = 7 + \frac{9 - 7}{65}(196 - 144) = 8.6$$
(b) 56th Percentile
$P_{56}$ = size of $\frac{56(n)}{100}$ items
= size of $\frac{56(245)}{100}$ items
= size of 137.2th item, which lies in 5-7 group.
Applying the formula of interpolation,
$$P_{56} = l_1 + \frac{l_2 - l_1}{f_1}(p_{56} - c)$$
where $l_1$, $l_2$ and $f_1$ represent the lower and the upper limits and the
frequency of the group in which the 56th percentile is situated; $p_{56}$, the value
of $\frac{56(n)}{100}$ item and $c$, the cumulative frequency of the group lower
than the one in which $P_{56}$ is situated.
We get
$$P_{56} = 5 + \frac{7 - 5}{85}(137.2 - 59) = 6.84$$
MODE
Mode is the most common item of a series. It represents the most typical
or frequent value of a series, a value which is, in fact, the fashion (la mode).
When one speaks of the "average student," "the most common wage,"
"the common man" or "the typical farm" and the like, he is unconsciously
referring to mode. If it is said that the most common wage in a particular
industry is Rs. 50 per month, what it means is that the largest number of
persons get this single figure of Rs. 50 as wage. Other figures of wage
are not as popular as this one, and the number of persons getting them is
less than the number getting Rs. 50 per month.
Methods of calculation. It appears from this definition that it must
be very easy to calculate the mode of a series. In fact it is not always
so. As we shall see later on, the most satisfactory method of calculating
mode is that of "curve fitting" which is an extremely difficult process.
In ordinary practice, however, mode is estimated by easier methods which
are comparatively very much less accurate than the method of curve
fitting. These methods are no doubt very simple and easy.
Example 15
Find out the mode of the following series:
Solution
Size of                  Frequency
item      (1)    (2)    (3)    (4)    (5)    (6)
5         48     100           156
6         52            108           168
7         56     116                         179*
8         60            123*   180*
9         63*    120*                 175*
10        57            112                  162
11        55     105           157
12        50            102           143
13        52     93                          150
14        41            98     161
15        57     120*                 172
16        63*           115                  163
17        52     100           140
18        48            88
19        40
(Each grouped total is entered against the first item of its group;
* marks the maximum frequency in each column.)
The frequencies in column (1) are first added in twos in columns (2)
and (3). Then they are added in threes in columns (4), (5) and (6). The
maximum frequency in each column is marked. It
will be observed that mode changes with the change in grouping. Thus
according to column (1) mode should be 9 or 16; according to column
(2) it should be either 9 or 10 or 15 or 16. To find out the point of
maximum concentration the data can be arranged in the shape of a table as
follows:
Analysis Table
Columns          Size of item containing maximum frequency
(1)                        9                        16
(2)                        9     10           15    16
(3)                  8     9
(4)                  8     9     10
(5)                        9     10    11
(6)            7     8     9
No. of times
a size occurs  1     3     6     3     1      1     2
Since the size 9 occurs the largest number of times it is the modal
size or mode is 9.
If we look at the frequencies in the original table, we shall find
that the frequency of 63, which is the maximum single frequency, is
against two values, 9 and 16. The series thus appears to be bimodal
but the process of grouping leads us to the conclusion that the
concentration of items round 9 is more than the concentration round 16.
Even if the frequency against 16 was 64 instead of 63 probably grouping
would have disclosed that the concentration of items round about
9 is more, even though the individual frequency against 9 is only 63. It
is thus never safe to rely only on the inspection of a series and to locate
the mode at the point of maximum frequency. Mode is affected by the
frequencies of the neighbouring items also, and, therefore, grouping is
essential, as it reveals the true point of maximum concentration.
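The grouping and analysis tables of Example 15 can be reproduced mechanically. The sketch below is one possible arrangement in Python, not the author's own procedure; the function name `mode_by_grouping` and the way ties are counted (every tied group is tallied, as column (2) requires) are assumptions of this illustration.

```python
from collections import Counter


def mode_by_grouping(sizes, freqs):
    """Tally how often each size falls in a maximum-frequency group."""
    tally = Counter()
    # (group length, starting offset) for columns (1) to (6):
    # single items, pairs, pairs leaving the first item, triples,
    # triples leaving the first item, triples leaving the first two.
    for length, offset in [(1, 0), (2, 0), (2, 1), (3, 0), (3, 1), (3, 2)]:
        starts = range(offset, len(freqs) - length + 1, length)
        totals = {s: sum(freqs[s:s + length]) for s in starts}
        best = max(totals.values())
        for s, total in totals.items():
            if total == best:                 # keep every tied group
                tally.update(sizes[s:s + length])
    return tally.most_common(1)[0][0], tally


sizes = list(range(5, 20))
freqs = [48, 52, 56, 60, 63, 57, 55, 50, 52, 41, 57, 63, 52, 48, 40]

mode, tally = mode_by_grouping(sizes, freqs)
print(mode, sorted(tally.items()))
```

The tally corresponds to the last row of the analysis table, and the size with the largest count is taken as the mode.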
Determination of mode in a continuous series
In a continuous series the determination of mode involves two
steps. First, by the process of grouping, the class in which there is
maximum concentration has to be located. After this the value of
mode is interpolated by the use of a formula. It should be remembered
that mode does not always give satisfactory results in a continuous
series. If the size of the class-interval is changed the modal class also
changes in many cases. Suppose, for example, the magnitude of
class-intervals is 10 and mode lies in, say, 30-40 group. If this series is
regrouped in class-intervals having magnitude of only 5, it is quite likely
that the mode may lie in, say, 45-50 group. It would depend on the
distribution of items in various class-intervals. For determining mode
in a continuous series, the class-intervals should not be very big in size,
but if the size of the class-intervals is very small the frequencies also
become very small, the distribution becomes irregular and the
determination of mode becomes very difficult. The series may even become
multimodal.
It has already been said that the mode is affected by the frequencies
of the neighbouring classes. The formulae for the interpolation of
mode are based on this very assumption. If the frequency of the
class or by deducting $\frac{f_0}{f_0 + f_2} \times (l_2 - l_1)$ from the upper limit of the
modal class. Thus if Z stands for the mode,
The two sets of formulae given above would give different values
of mode as they are based on different assumptions. In the first case
we take into account only the frequencies of the preceding and
succeeding classes whereas in the second case (i) the difference of the modal
frequency and the preceding frequency, and (ii) the difference of the
modal frequency and the succeeding frequency, are taken into account.
The second set of formulae is supposed to be better than the
first set and usually mode is interpolated by starting with the lower
limit. As such we shall be making use of the following formula in
the determination of mode in a continuous series.
$$Z = l_1 + \frac{f_1 - f_0}{2f_1 - f_0 - f_2}(l_2 - l_1)$$
Example 16. The following table gives the length of life of 150
electric lamps:
Life (hours)        Frequency of lamps
0 to 400                  4
400 to 800               12
800 to 1200              40
1200 to 1600             41
1600 to 2000             27
2000 to 2400             13
2400 to 2800              9
2800 to 3200              4
Calculate the mode.
Solution. Determination of mode by grouping
Life (hours)               Frequency of lamps
               (1)    (2)    (3)    (4)    (5)    (6)
0-400           4     16            56
400-800        12            52            93*
800-1200       40     81*                         108*
1200-1600      41*           68*    81*
1600-2000      27     40                   49
2000-2400      13            22                   26
2400-2800       9     13
2800-3200       4
(Each grouped total is entered against the first class of its group;
* marks the maximum frequency in each column.)
$$Z = l_1 + \frac{f_1 - f_0}{2f_1 - f_0 - f_2}(l_2 - l_1)$$
Where Z stands for mode, $l_1$ and $l_2$ stand for the lower and upper
limits of the modal group, $f_1$ stands for frequency of the modal group,
$f_0$ stands for frequency of the group preceding the modal group, $f_2$ stands
for frequency in the group succeeding the modal group.
We get,
$$Z = 1200 + \frac{41 - 40}{82 - 40 - 27} \times 400 = 1226.67 \text{ hours}$$
Thus
The modal life of the lamp = 1226.67 hours.
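As a check on the arithmetic, the interpolation formula for mode can be evaluated directly. The short Python sketch below uses the figures of Example 16; the function name `interpolated_mode` is an assumption made for illustration.

```python
def interpolated_mode(l1, l2, f0, f1, f2):
    """Z = l1 + (f1 - f0) / (2*f1 - f0 - f2) * (l2 - l1).

    l1, l2: limits of the modal class; f1: its frequency;
    f0, f2: frequencies of the preceding and succeeding classes.
    """
    return l1 + (f1 - f0) / (2 * f1 - f0 - f2) * (l2 - l1)


# Modal class 1200-1600 with frequency 41, between classes of 40 and 27.
z = interpolated_mode(1200, 1600, 40, 41, 27)
print(round(z, 2))
```

Because the preceding frequency (40) is much closer to the modal frequency than the succeeding one (27), the mode is pulled towards the lower limit of the class.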
Determination of mode by curve fitting
As has been said earlier, the above methods of the calculation
of mode are unsatisfactory. In most of the distributions, as they arise
in actual practice, these methods would not give satisfactory results.
The ideal method of calculating the mode is that of curve fitting. Since
there are many irregularities in the data which we normally come across,
it is necessary to remove them before determination of mode. These
irregularities are removed by the technique of curve fitting. Attempts
are made to fit an ideal curve which gives the closest possible fit to the
actual distribution. The value of the variable corresponding to the
maximum of this ideal curve is the value of the mode. The technique
of curve fitting is highly mathematical and should be left to the more
advanced students of this subject.
Determination of mode from mean and median
In a symmetrical distribution the mean, median and mode are
identical. We shall discuss in the next chapter the concept of a
symmetrical distribution which gives a normal curve. In actual practice,
however, symmetrical distributions are very rare, and data usually give
an asymmetrical curve. In distributions which moderately differ from
Drawbacks of mode
Mode is an unsatisfactory average and has many drawbacks.
Some of them are as follows:
(i) Mode is ill-defined, indeterminate and indefinite. The very
first condition laid down for an ideal average that it should be rigidly
defined is not fulfilled by it.
(ii) Mode is not based on all the observations of a series and as
such the second condition is also not fulfilled by it.
(iii) Mode is not capable of further mathematical treatment.
(iv) Mode may be unrepresentative in many cases. If in a series
of 1000 items 20 have a particular value and other values have frequencies
less than 20, it does not necessarily mean that the value whose frequency
is 20 is the typical or average value. In such cases data should be
converted into class intervals of a bigger magnitude.
(v) In many cases it may be impossible to set a definite value of
mode. There may be 2, 3 or more modal values.
Comparison of mode with mean and median
From the above discussion about the merits and drawbacks of
mean, median and mode it is obvious that mode does not stand in
comparison either to mean or median. Mode no doubt possesses the
merit of being the most popular item of a series and has also the
advantage of easy calculation and common understandability, yet its
drawbacks are too many to be set off against these merits. Mean is
simple in calculation, its value is definite and can be easily determined.
It is amenable to algebraic treatment and is usually not affected much
by fluctuations of sampling. Median is more easily calculated than
even mean, and in certain cases it is as stable as mean, but if variations
in the values of items are not uniform, median is indeterminate, and is
almost incapable of algebraic treatment. Mode is hardly suitable for
most of the elementary studies as it is correctly determined only by
curve fitting which is an extremely difficult process. It is
unrepresentative in many cases, and is not based on all the observations of a series.
Thus, of these three averages, mean has definite advantages over median
and mode, though there may be some cases where median or mode
may have preference over mean. Mode has its own importance and
it may be the reason for giving its value along with mean but it should
be clearly understood that mode cannot replace mean and for that
matter neither can median do so. However, it should not be taken
to mean that median and mode are superficial averages and have no
independent virtues. There are certain fields in which median or
mode may give better result than the mean, but such cases are few
and the universality of mean cannot be challenged on account of these
cases. We shall discuss more about this point in a later section after
we have examined the other averages also.
GEOMETRIC MEAN
Geometric mean is the nth root of the product of n items of a series.
Thus if the geometric mean of 3. 6 and P Ie. ~o be calculated it would
be equal to the cube root of the product of these figures. Similarly
the geometric mean of 8, 9, 12 and 16 would be the 4th root of the
product of these four figures.
Symbolically $g = \sqrt[n]{m_1 \times m_2 \times m_3 \times \dots \times m_n}$
where g stands for the geometric mean, n for the number of items and
m for the values of the variable.
The calculation of the geometric mean by this process is possible
only if the number of items is very few. If the number of items is
large and their size is big, this method is more or less out of question.
In such cases calculations have to be done with the help of logs. In
terms of logs,
$$\log g = \frac{\log m_1 + \log m_2 + \log m_3 + \dots + \log m_n}{n}$$
or
$$g = \text{Antilog}\left\{\frac{\log m_1 + \log m_2 + \log m_3 + \dots + \log m_n}{n}\right\}$$
or
$$\text{Geometric Mean} = \text{Antilog}\left[\text{assumed log} + \frac{\sum \text{Deviations}}{n}\right]$$
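The logarithmic method can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `geometric_mean` is an assumption, and common (base 10) logarithms are used to mirror the log tables the text has in mind. The two test series, 2, 4, 8 and 3, 6, 8, 9, are ones for which the text states the geometric means 4 and 6.

```python
import math


def geometric_mean(values):
    """Antilog of the arithmetic average of the logs of the values."""
    logs = [math.log10(m) for m in values]     # requires every m > 0
    return 10 ** (sum(logs) / len(logs))


g_three = geometric_mean([2, 4, 8])
g_four = geometric_mean([3, 6, 8, 9])
print(round(g_three, 4), round(g_four, 4))
```

Working through logs replaces one large product and an nth root by a sum and a division, which is why the method suits long series.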
$g = \text{Antilog}\ \frac{2.1208}{8} = \text{Antilog}\ .2651 = 1.841$
Series B. $g = \text{Antilog}\ \frac{\bar{16} + 6.3512}{8} = \text{Antilog}\ \bar{2}.7938 = .06220$
Algebraic properties of geometric mean
Geometric mean possesses certain mathematical properties and they
are as follows:
(i) Just as in case of arithmetic average the sum of the items
remains unchanged if each item is replaced by the arithmetic average,
similarly in case of geometric mean the product of the items remains
unchanged if each item is replaced by the geometric mean. Thus the
total of 2, 4 and 8 is 14 and the arithmetic average is $\frac{14}{3}$. If in place
of these figures we substitute the arithmetic average the total would
still remain 14. Similarly in case of geometric mean the product of
these three figures 2, 4 and 8 is 64 and the geometric mean is 4. If in
place of these numbers the geometric mean is written the product would
still remain 64.
(ii) On account of the above property of the geometric mean, it
is possible to calculate the combined geometric mean of two or more
series if only their geometric means and the number of items are known.
$= \text{antilog}\left[\frac{10.8681}{5}\right] = \text{antilog}\ 2.1736 = 149$
If we calculate geometric mean of the five items together we shall
get this very figure. It can be verified from the answer of Example No.
17 in which the geometric mean of these five items has been calculated.
(iii) Just as in the case of arithmetic average, sum of the deviations
from the mean on either side is always equal, similarly in case of
geometric mean the product of the corresponding ratios on either side
is always equal. If the ratios of the geometric mean to the figures
which are equal to or less than it are multiplied together, this product
would be equal to the product of the ratios of the figures more than the
geometric mean to the geometric mean.
Thus the geometric mean of 3, 6, 8 and 9 is equal to 6. The
product of the ratios of items equal to it or less than it would be
equal to the product of the ratios of items more than it.
Thus $\frac{g}{3} \times \frac{g}{6} = \frac{8}{g} \times \frac{9}{g}$
or
$\frac{6}{3} \times \frac{6}{6} = \frac{8}{6} \times \frac{9}{6}$
This property of the geometric mean is very important. It
indicates that geometric mean measures relative changes. If the price
$$r = \sqrt[n]{\frac{P_n}{P_0}} - 1$$
Thus if Rs. 1,000 at compound interest become Rs. 1,500 at the end
of 10 years there has been an increase of 50% and the simple rate of
interest is 5%. The compound rate would be
$$r = \sqrt[10]{\frac{1500}{1000}} - 1 = \sqrt[10]{1.5} - 1 = 1.041 - 1 = .041 \text{ or } 4.1\%$$
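The compound rate formula can be checked numerically. The Python below is a minimal sketch using the figures just given (Rs. 1,000 growing to Rs. 1,500 in 10 years); the function name `compound_rate` is an assumption of this illustration.

```python
def compound_rate(p0, pn, n):
    """r = (pn / p0) ** (1 / n) - 1, the average rate per period."""
    return (pn / p0) ** (1 / n) - 1


r = compound_rate(1000, 1500, 10)
print(round(r * 100, 1), "per cent")

# Compounding at r for 10 years should return the final amount.
final_amount = 1000 * (1 + r) ** 10
```

The result is lower than the simple rate of 5% because each year's interest itself earns interest in later years.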
Whenever we have to find out the average of the rates of increase
or decrease, such problems arise. If we calculate the mean of the
rates of increase or decrease the study would be inaccurate as mean
measures absolute changes but if the geometric mean of the rates of
increase or decrease is calculated the results would be accurate, as
geometric mean measures relative changes.
Merits of geometric mean
Besides the above-mentioned mathematical properties the geometric
mean has many other merits. We shall now examine the worth of
this average by finding out how many conditions of an ideal average
(laid down earlier) it satisfies.
(i) The geometric mean is rigidly defined and its value is a precise
figure.
(ii) It is based on all the observations of a series. Like arithmetic
average it cannot be calculated, if even a single value of a series
is missing.
(iii) It is capable of further algebraic treatment. As we have
seen above, various types of mathematical relationships can be established
between data when a relative study is being made with the help of
geometric mean.
(iv) Geometric mean is not much affected by the fluctuations of
sampling. It gives comparatively more weight to smaller items. In
this respect it is better than the arithmetic average and a single big figure
does not push its value very much.
Thus out of five conditions laid down for an ideal average
geometric mean satisfies four.
1,450 .00069
7,200 .00014
120 .00833
1060 .00094
150 .00667
480 .00208
360 .00278
96 .01042
200 .00500
520 .00192
60 .01667
~·m~048
Solution:
Calculation of quadratic mean
Size of items    Square of the size
(m)              ($m^2$)
10                 100
30                 900
40                1600
50                2500
70                4900
n = 5            10000
$$Qm = \sqrt{\frac{10000}{5}} = 44.72$$
The arithmetic average of the series would have been 40. Quadratic
mean is seldom used as an average except in case of finding out
the average of the positive and the negative deviations from a measure
of central tendency. In that case it is known as standard deviation. We
shall discuss it in the next chapter.
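The quadratic mean of the five items above can be verified with a short Python sketch. The function name `quadratic_mean` is an assumption made for illustration only.

```python
import math


def quadratic_mean(values):
    """Square root of the arithmetic average of the squared values."""
    return math.sqrt(sum(m * m for m in values) / len(values))


items = [10, 30, 40, 50, 70]
qm = quadratic_mean(items)
am = sum(items) / len(items)       # arithmetic average, for comparison
print(round(qm, 2), am)
```

Squaring gives the larger items more influence, so the quadratic mean of unequal items always exceeds their arithmetic average.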
Moving average
Moving average is calculated by using the technique of simple
arithmetic average. It is useful in removing the irregularity of time
series and is usually calculated to study the long period trend. The first
thing to be decided in the calculation of moving average is the "period"
for which the average is to be calculated. The moving average may
be three-yearly, five-yearly or seven-yearly depending on the nature of
the series. We shall discuss this problem of periodicity of moving
average later in the chapter on Analysis of Time Series. For the present
we shall simply illustrate the technique of its calculation.
If a three-yearly moving average is to be calculated the arithmetic
average of the first three years' figures would be found out and written
against the middle year (second year in this case). Then the first year's
figure would be dropped and the arithmetic average of second, third
and fourth years' figures would be calculated and written against the
third year. Similarly the arithmetic average of the figures of third,
fourth and fifth years would be written against the fourth year and so
on. The following example would illustrate the method of its
calculation.
Example 22. Calculate the three-yearly moving average of the
following figures relating to the annual sales of a concern (in lakhs of
rupees).
Calculation of three-yearly moving average
Year     Sales (in lakhs    3-Yearly moving    3-Yearly moving
         of rupees)         total              average
1945          8                ...                ...
1946          9                25                 8.3
1947          8                24                 8.0
1948          7                23                 7.7
1949          8                24                 8.0
1950          9                27                 9.0
1951         10                30                10.0
1952         11                32                10.7
1953         11                34                11.3
1954         12                33                11.0
1955         10                ...                ...
Similarly, if a five-yearly moving average has to be calculated
the first five figures (of years 1945 to 1949) would be added and their
average would be written against the third year or 1947, then the next
five figures leaving the first (of years 1946 to 1950) would be averaged
and the figure written against the middle year of 1948 and so on.
Moving average is very helpful in removing the fluctuations of
time series and giving an idea about the general trend.
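The procedure for three-yearly and five-yearly moving averages can be sketched as one routine. The Python below is illustrative only; the function name `moving_average` is an assumption, and the sketch assumes an odd period so that each average has a middle year to be written against.

```python
def moving_average(values, period):
    """Average of each run of `period` values, keyed by its middle position."""
    half = period // 2                     # assumes an odd period
    return {i + half: sum(values[i:i + period]) / period
            for i in range(len(values) - period + 1)}


years = list(range(1945, 1956))
sales = [8, 9, 8, 7, 8, 9, 10, 11, 11, 12, 10]    # Example 22, lakhs of rupees

three_yearly = moving_average(sales, 3)
five_yearly = moving_average(sales, 5)
for position, average in three_yearly.items():
    print(years[position], round(average, 1))
```

No average is produced for the first and last year of a three-yearly series (or the first two and last two of a five-yearly one), which matches the blank entries in the table.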
Progressive average
It is also calculated by the help of simple arithmetic average. It
is a cumulative average and is different from the moving average. In
the calculation of this average, figures of all previous years are added
and no figure is left out as in the case of moving average. Thus the
progressive average of the second year would be equal to the arithmetic
average of the figures of the first two years; the progressive average
of the third year would be equal to the arithmetic average of the figures
of the first three years and so on.
The following illustration would clarify the procedure:
Example 23. Calculate the progressive average of the data given
in Example 22:
Calculation of progressive average
Years    Sale (in lakhs of    Progressive    Progressive
         rupees)              total          average
1945          8                  8              8.0
1946          9                 17              8.5
1947          8                 25              8.3
1948          7                 32              8.0
1949          8                 40              8.0
1950          9                 49              8.2
1951         10                 59              8.4
1952         11                 70              8.8
1953         11                 81              9.0
1954         12                 93              9.3
1955         10                103              9.4
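The progressive totals and averages are running (cumulative) quantities, which the following Python sketch computes for the sales figures of Example 22. The variable names are assumptions made for illustration.

```python
from itertools import accumulate

sales = [8, 9, 8, 7, 8, 9, 10, 11, 11, 12, 10]    # 1945 to 1955

# Progressive total: sum of all figures up to and including each year.
progressive_totals = list(accumulate(sales))

# Progressive average: that total divided by the number of years so far.
progressive_averages = [total / (i + 1)
                        for i, total in enumerate(progressive_totals)]

for year, total, average in zip(range(1945, 1956),
                                progressive_totals, progressive_averages):
    print(year, total, round(average, 2))
```

Unlike the moving average, every year receives a value, and the last progressive average equals the simple arithmetic average of the whole series.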
Progressive average is used by business houses particularly in early
years with a view to compare the current profits with those of
the past.
Relation between different averages
When different averages have been calculated from a given set of
observations it will be found that there is a relationship between their
values. Generally these relationships are of the following type:
(i) If a series is "normal" or "symmetrical" the values of its mean,
median and mode would be identical.
there are wide variations in any series median is the most unsuitable
average. Similarly, if the enquiry under question relates to, say,
"average size of readymade clothes" or "size of typical farms," the
average to be used is mode. The use of mode is every day increasing
in business and commerce. Modal output per machine or modal time
needed to produce a commodity are very important concepts in the
business world of today. But mode is very often indeterminate and
unrepresentative and is entirely unsuitable for many enquiries. It is
not capable of further algebraic treatment and has limited use. If an
enquiry is being conducted to study the relative changes in the price
level at two periods, neither arithmetic average nor median or mode
would give satisfactory results. In such cases the best average is the
geometric mean. In the construction of index numbers the use of
geometric mean is almost universal. But geometric mean is entirely
useless if bigger items have to be given more weight or if a study of
absolute, rather than relative changes, is undertaken. Harmonic mean
similarly is the best average if small items have to be given more weights
or if we have to find out the average of certain types of rates, etc. If,
for example, we have to calculate the average speed of a person who
walks four miles per hour for the first mile and three miles an hour
for the second mile, arithmetic average would give inaccurate results.
Harmonic mean of these figures, which would be $3\frac{3}{7}$, is the correct
average. This person takes fifteen minutes to cover the first mile and
twenty to cover the second mile, or in thirty-five minutes he covers
two miles. The speed is $3\frac{3}{7}$ miles per hour.
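The walking example can be verified with exact fractions. The Python sketch below is illustrative; the function name `harmonic_mean` is an assumption, and it accepts whole-number speeds only because it builds exact reciprocals with `Fraction`.

```python
from fractions import Fraction


def harmonic_mean(values):
    """Number of items divided by the sum of their reciprocals."""
    return len(values) / sum(Fraction(1, v) for v in values)


speed = harmonic_mean([4, 3])          # miles per hour over two equal miles

# Cross-check: 2 miles covered in 15 + 20 minutes.
hours_taken = Fraction(15, 60) + Fraction(20, 60)
direct_speed = Fraction(2) / hours_taken

arithmetic_average = Fraction(4 + 3, 2)
print(speed, direct_speed, arithmetic_average)
```

The arithmetic average of the two speeds is 3.5 miles per hour, which overstates the true speed because more time is spent walking at the slower rate.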
The above discussion clearly shows that each type of average has
its own field of importance and usefulness. Before selecting an average
all these considerations should be kept in mind. In actual practice
two or three averages of a series may be necessary for a proper
understanding of its special features. A discriminate use of averages is
essential for sound statistical analysis. But all said and done, it has to be
admitted that arithmetic average would be found to be the ideal average
for a larger number of enquiries than any other average.
Limitations of averages
Even when an average has been selected very judiciously and is
ideal for a particular investigation, it should never be forgotten that
even the best average has its own limitations. An average is a single
figure representing a series, and no single figure can condense in itself
all the properties of the items which it represents. This is the reason
why conclusions which are drawn on the basis of a study of averages
are not always infallible. The average height of women may be less
than the average height of men but it does not mean that no woman
can be taller than a man. The well-known example of the mathematician
who calculated the average depth of a stream and, finding it lower
than the average height of his family members, attempted to cross it and
drowned with his family in the process, is an illustration on this point.
The average depth of the river may have been lower than the height
of the shortest member of the mathematician's family, but at some
point the depth of the stream must have been more than the height of
the tallest member in the group.
Average is a single figure and can be expected to represent a series
only as best as a single figure can. Averages do not throw light on the
formation of a series or distribution of frequencies round the various
values of a variable. It is for this reason that measures of dispersion
and skewness are calculated. Averages do not reveal the whole story of
a series. A student getting 30, 40 and 50 marks respectively, in three
examinations would have the same average as another who gets 50, 40
and 30 marks respectively. The progress of the two students is in
different directions but on the basis of the averages they will be ranked
together.
In fact if wrong conclusions are drawn by the use of judiciously
selected averages, it is not the fault of the averages. The fault lies
with the person drawing the conclusions. The inherent limitations of
averages should always be kept in mind and they should not be expected
to reveal more than what they can.
WEIGHTED AVERAGE
Need and meaning. In the calculation of simple average each
item of the series is considered equally important but there may be
cases where all items may not have equal importance, and some of
them may be comparatively more important than others. The fundamental
purpose of finding out an average is that it shall "fairly" represent,
so far as a single figure can, the central tendency of the many
varying figures from which it has been calculated. This being so,
it is necessary that if some items of a series are more important than
others, this fact should not be overlooked altogether in the calculation
of an average. If we have to find out the average income of the
employees of a certain mill and if we simply add the figures of the
income of the manager, an accountant, a clerk, a labourer and a watchman
and divide the total by five, the average so obtained cannot be
a fair representative of the income of these people. The reason is that
in a mill there may be one manager, two accountants, six clerks, one
thousand labourers and one dozen watchmen, and if it is so, the relative
importance of the figures of their income is not the same. Similarly
if we are finding out the change in the cost of living of a certain group
of people and if we merely find the simple arithmetic average of the
prices of the commodities consumed by them, the average would be
unrepresentative. All the items of consumption are not equally important.
The price of salt may increase by 500% but this will not affect
the cost of living to the extent to which it would be affected, if the
price of wheat goes up only by 50%. In such cases if an average has
to maintain its representative character, it should take into account
the relative importance of the different items from which it is being
calculated. The simple average gives equal importance to all the
items of a series. In this sense a simple average is also a weighted
average, because in a simple average the relative importance of all the
items is supposed to be the same. But in actual practice the importance
of various items is not always the same and in such cases the
simple arithmetic average and the weighted arithmetic average would
differ in value. Therefore, in order that an average may be a typical or
a representative average, it is necessary that the relative importance
of items is taken into account in its calculation. Thus if item A is
considered to be five times as important as item B, the weights of these
items respectively should be 5 and 1. Weights are figures which indicate
the relative importance of various items.
Difficulties in weighting. It is easy to say that in many cases it is
better to take into account the relative importance of items and to have
a weighted average, rather than simple average, but it is very difficult
to decide the relative importance of different items. If we have to
decide the relative importance of items, the problem that would arise
would be about the basis or criteria of determining the relative importance.
How should weights be assigned, is a question very difficult to
answer. In fact no hard and fast rule can be laid down for the assignment
of weights, as the relative importance of items depends on the
nature and purpose of the investigation. In some cases the weights
are determined without much difficulty, and such cases are those where
weights are determined on the basis of some evidences associated with
given data. If we have to decide the weights of the income figures of
a manager, an accountant, a clerk, a labourer and a watchman, the
simplest method would be to give them weights in accordance with
their number. Thus if there is one manager, two accountants, six
clerks, one thousand labourers and twelve watchmen, the weights
would also be these very figures respectively. To calculate the average
income of these people if instead of finding out the simple arithmetic
average of the figures of their incomes, we multiply their incomes by
their numbers (weights), and if the total of these products is divided
by the total of weights, we shall get the weighted arithmetic average
of the series. This average would be a better representative of the
series than the simple arithmetic average. Many writers like Secrist
and Kelly are of opinion, and rightly too, that this is not a weighted
average. When values are multiplied by their frequencies and the
sum of their products is divided by the total of their frequencies, it
is in fact a simple arithmetic average of the series. In cases of discrete
and continuous series we have already seen that arithmetic average
is calculated by multiplying the values by their respective frequencies.
Such writers are of the opinion that weights should be determined
by some such evidence, which is not associated with the items themselves.
But it is neither easy nor safe to associate weights to various
items arbitrarily, as in such cases weighted average may give misleading
conclusions. Weights have to be judiciously selected.
In fact difficulties in the selection of proper weights are so many
that many writers are of opinion that it is better to have simple average
than to have weighted average of doubtful fairness. Thus Bowley says:
Where $\sum mw$ stands for the sum of the products of the values and their
respective weights, and $\sum w$ for the sum of the weights. The following
illustration would clarify the formula:
Calculation of the weighted arithmetic average: direct method
Example 24. Calculate the weighted arithmetic average of the
price of tea, from the following data assuming the quantities sold as
weights:
Price per pound    Quantities sold
(Rs.)              (pounds)
2.25                  14
2.50                  11
2.75                   9
3.00                   6
error in the size of items. The reason for it is that the errors in weights
are usually unbiased and compensate each other while errors in the
values of items are generally biased ones. It is for this reason that
we had concluded above that attempt should be made to make the
items free from bias and we should not strain after exactness in
weights. According to King, "The items should be as exact as possible
and the weights used should be approximately accurate ......".
Short-cut method of calculating weighted arithmetic average
The method discussed above for the calculation of weighted
arithmetic average is sometimes found to be very tedious, particularly
when the size of items is big. In such cases a short-cut method can
be used. In this method, first an average is assumed and the deviations
of each item from the assumed average are multiplied by the respective
weights of the items. The sum of these products is then divided by
the total of weights and added to the assumed average. The resulting
figure is the actual weighted arithmetic average of the series.
Symbolically,
$$a' = x' + \frac{\sum d'w}{\sum w}$$
where $a'$ stands for the weighted arithmetic average, $x'$ for the
assumed average, $\sum d'w$ for the sum of the products of the deviations
and the respective weights of items, and $\sum w$ for the total of the weights.
The following example would illustrate the formula:
Example 25
From the following table calculate the weighted average price of tea.
Price per lb. (Rs. p.)    Lbs. sold
1.00                      200
1.35                      275
1.62                      400
1.75                      150
2.00                      100
2.25                      75
2.50                      50
Solution. Calculation of the weighted average price of a lb. of tea
Price in paisa    Lbs. sold    Deviations from assumed        Total deviations
per lb. (m)       (w)          weighted average (175) (d')    (d'w)
100               200          -75                            -15,000
135               275          -40                            -11,000
162               400          -13                            -5,200
175               150          0                              0
200               100          +25                            +2,500
225               75           +50                            +3,750
250               50           +75                            +3,750
$\sum m = 1{,}247$    $\sum w = 1{,}250$    $\sum d'w = -21{,}200$
$$a' = x' + \frac{\sum d'w}{\sum w}$$
where $a'$ stands for the weighted average, $x'$ for the assumed weighted
average, $w$ for weight and $d'$ for deviation from the assumed weighted
average.
We get $a' = 175 - \dfrac{21{,}200}{1{,}250} = 175 - 16.96 = 158.04$ paisa
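As a check on the short-cut method, the sketch below computes the weighted average of the Example 25 data both directly and from deviations about an assumed average, and confirms that the two routes agree whatever value is assumed. The function names are hypothetical and the code is an illustration only, not the author's own working.

```python
# Example 25 data: price in paisa per lb. and lbs. sold (the weights).
prices_paisa = [100, 135, 162, 175, 200, 225, 250]
lbs_sold = [200, 275, 400, 150, 100, 75, 50]


def weighted_mean_direct(values, weights):
    """Direct method: sum(m * w) / sum(w)."""
    return sum(m * w for m, w in zip(values, weights)) / sum(weights)


def weighted_mean_shortcut(values, weights, assumed):
    """Short-cut method: assumed average + sum(d' * w) / sum(w)."""
    deviations = [m - assumed for m in values]
    correction = sum(d * w for d, w in zip(deviations, weights)) / sum(weights)
    return assumed + correction


direct = weighted_mean_direct(prices_paisa, lbs_sold)
shortcut_175 = weighted_mean_shortcut(prices_paisa, lbs_sold, assumed=175)
shortcut_200 = weighted_mean_shortcut(prices_paisa, lbs_sold, assumed=200)

print(f"direct method            = {direct:.2f} paisa")
print(f"short-cut, assumed 175   = {shortcut_175:.2f} paisa")
print(f"short-cut, assumed 200   = {shortcut_200:.2f} paisa")
```

The choice of assumed average affects only the size of the figures handled, never the result, which is why a round value near the middle of the series is convenient.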
(a) When the importance of all the items in a series is not equal. We
have seen that simple arithmetic average gives equal importance to all
the items of a series. In many cases all the items may not be of equal
importance. If it is so, a simple arithmetic average would give us
misleading conclusions. The following example would clarify the
point:
                          Marks            Weighted marks
Subject        Weight    A     B     C     A     B     C
Statistics     4         63    60    65    252   240   260
Mathematics    3         65    64    70    195   192   210
Economics      2         58    56    63    116   112   126
Hindi          1         70    80    52    70    80    52
Total          10        256   260   250   633   624   648
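The table can be reproduced with a short Python sketch, which also shows how the ranking of the three students changes once the subject weights are applied. The dictionary layout and function names are illustrative choices, not anything given in the text.

```python
# Subject weights and marks of students A, B and C, as in the table above.
weights = {"Statistics": 4, "Mathematics": 3, "Economics": 2, "Hindi": 1}
marks = {
    "A": {"Statistics": 63, "Mathematics": 65, "Economics": 58, "Hindi": 70},
    "B": {"Statistics": 60, "Mathematics": 64, "Economics": 56, "Hindi": 80},
    "C": {"Statistics": 65, "Mathematics": 70, "Economics": 63, "Hindi": 52},
}


def simple_average(student_marks):
    """Every subject counts equally."""
    return sum(student_marks.values()) / len(student_marks)


def weighted_average(student_marks, subject_weights):
    """Each mark is multiplied by its subject weight before averaging."""
    total = sum(student_marks[s] * subject_weights[s] for s in student_marks)
    return total / sum(subject_weights[s] for s in student_marks)


results = {
    name: (simple_average(m), weighted_average(m, weights))
    for name, m in marks.items()
}

for name, (simple, weighted) in results.items():
    print(f"{name}: simple = {simple:.1f}, weighted = {weighted:.1f}")

best_simple = max(results, key=lambda n: results[n][0])
best_weighted = max(results, key=lambda n: results[n][1])
print("best by simple average:  ", best_simple)
print("best by weighted average:", best_weighted)
```

On simple averages B stands first, largely because of high marks in the subject carrying the least weight; on weighted averages C stands first.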
(d) When ratios, percentages or rates are being averaged. Suppose the
heights of four groups of persons are measured and it is found that
5% of the persons in group A, 10% in group B, 8% in group C, and
4% in group D have heights less than 50" and it is required to find
out the percentage of people in all the groups combined together
whose heights would be less than 50". Simple arithmetic average of
these percentages would give a misleading conclusion. The reason is
that we do not know the number of persons in each group. In such
cases we should presume certain numbers in each group, and then on
that basis calculate the weighted arithmetic average, which gives the
correct results. If suppose the number of persons in these groups
were respectively 50, 70, 75 and 55, the weighted arithmetic average
can be calculated by taking these frequencies as weights of the various
percentages.
The percentage of people with heights less than 50" (in all
the groups combined together) would be:
$$\frac{(5 \times 50)+(10 \times 70)+(8 \times 75)+(4 \times 55)}{50+70+75+55} = \frac{250+700+600+220}{250} = \frac{1770}{250} = 7.08\%$$
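The reason the weighted figure is the correct one can be seen by counting persons rather than averaging percentages. The sketch below, with hypothetical variable names, converts each group's percentage into a number of persons and then expresses the total as a percentage of all persons.

```python
# Percentage of persons under 50 inches in groups A, B, C, D and group sizes.
percent_below = [5, 10, 8, 4]
group_sizes = [50, 70, 75, 55]

# Number of persons under 50 inches in each group.
persons_below = [p * n / 100 for p, n in zip(percent_below, group_sizes)]

combined_percent = 100 * sum(persons_below) / sum(group_sizes)
simple_percent = sum(percent_below) / len(percent_below)

print(f"persons below 50 inches in each group: {persons_below}")
print(f"combined (weighted) percentage = {combined_percent:.2f}%")
print(f"simple average of percentages  = {simple_percent:.2f}%")
```

The simple average of the four percentages differs from the combined percentage because the groups are not of equal size.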
(e) When it is desired to calculate the average of a series from the averages
of its component parts. We have already discussed in the section on
simple arithmetic average how the means of two or more component
series can be combined in one. The method involves the calculation
of weighted arithmetic average of the different means, using the number
of items in each case as the weights. Thus, if the average of a series
is 20 and the number of items in it is 10 and the average of another series
is 25 and the number of items in it is 15, the combined average of the
two series would be equal to the weighted average of these two averages,
the weights being 10 and 15 respectively (the number of items in each
case). The weighted arithmetic average would be:
$$\frac{(20 \times 10)+(25 \times 15)}{10+15} \text{ or } 23$$
The simple arithmetic average of the two averages would be
$\dfrac{20+25}{2}$ or 22.5. This is an inaccurate average, as, if it is multiplied
by the total frequency (now 25), it would not give the correct aggregate.
If, however, we multiply the weighted arithmetic average or 23
by the total frequency or 25, the product would be 575, which is the
total of the aggregates of the two series (200+375).
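The aggregate test described above is easy to verify mechanically. The following sketch uses a hypothetical helper `combined_mean` and checks that only the weighted average, multiplied by the total number of items, reproduces the combined aggregate.

```python
def combined_mean(means, counts):
    """Weighted average of component means, using item counts as weights."""
    return sum(m * n for m, n in zip(means, counts)) / sum(counts)


means = [20, 25]
counts = [10, 15]
total_items = sum(counts)

weighted = combined_mean(means, counts)
simple = sum(means) / len(means)

# Aggregate of each component series is its mean times its number of items.
true_aggregate = sum(m * n for m, n in zip(means, counts))

print(f"weighted average = {weighted}, times {total_items} = {weighted * total_items}")
print(f"simple average   = {simple}, times {total_items} = {simple * total_items}")
print(f"true aggregate   = {true_aggregate}")
```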
Discriminate weighting. We have seen that in many cases the simple
arithmetic average and weighted arithmetic average differ considerably,
and the question that arises is, which of the two averages should be used
in such cases to represent the series? For this, it is necessary to study
the weights of the items in relation to their size. Sometimes it would
be found that big items in a series are associated with big weights and
small items with small weights. In such cases weighted arithmetic
average would be more than the simple arithmetic average. Thus the
simple arithmetic average of natural numbers 1, 2, 3, 4, 5, 6, 7, 8, 9,
and 10 is 5.5, and if these numbers are associated with weights whose
respective values are 1, 2, 3, 4, 5, 6, 7, 8, 9, and 10 the weighted arithmetic
average would be 7.0.
If, on the other hand, big items are associated with small weights
and small items with big weights, the weighted arithmetic average
would be less than the simple arithmetic average. If the weights in
the above case were respectively 10, 9, 8, 7, 6, 5, 4, 3, 2, and 1 the
weighted arithmetic average would be 4.0 whereas the simple arithmetic
average is 5.5.
Chance weighting. If weights are indiscriminately associated with
values or, in other words, if big items are associated with both big and
small weights and similarly small items with both small and big weights,
the weighted average and the simple average would not materially differ.
Thus if for the values of 1, 2, 3, 4, 5, 6, 7, 8, 9, and 10 the weights
were respectively 10, 3, 6, 4, 5, 8, 2, 1, 9, and 7 the weighted arithmetic
average would be 5.4 and the simple arithmetic average is 5.5.
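The three cases of weighting described above can be checked with the same ten values and the three sets of weights quoted in the text. The sketch is illustrative only; `weighted_mean` is a hypothetical helper.

```python
def weighted_mean(values, weights):
    """Sum of (value x weight) divided by sum of weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


values = list(range(1, 11))  # natural numbers 1 to 10

weights_rising = list(range(1, 11))                 # big items, big weights
weights_falling = list(range(10, 0, -1))            # big items, small weights
weights_chance = [10, 3, 6, 4, 5, 8, 2, 1, 9, 7]    # no relation to size

simple = sum(values) / len(values)
rising = weighted_mean(values, weights_rising)
falling = weighted_mean(values, weights_falling)
chance = weighted_mean(values, weights_chance)

print(f"simple average                    = {simple:.1f}")
print(f"big items with big weights        = {rising:.1f}")
print(f"big items with small weights      = {falling:.1f}")
print(f"weights unrelated to size (chance) = {chance:.1f}")
```

Only when the weights move with or against the size of the items does the weighted average pull away from the simple one.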
$$= \text{antilog}\left(\frac{42.9129}{20}\right)$$
$$= \text{antilog } 2.1456$$
$$= 140 \text{ (to the nearest whole number)}$$
The method discussed above is the direct method. We have
seen in the calculation of simple geometric mean that a short-cut method
can be used by assuming a geometric mean. Weighted geometric
mean can also be calculated by the short-cut method. The deviations
of the logs. from the log. of the assumed geometric mean are multiplied
with their respective weights, and the sum of the products is divided
by the total of the weights. The resulting figure is added to the log.
of the assumed geometric mean. The antilog. of this figure would
give the actual weighted geometric mean of the series.
where d' stands for the deviations of the logs. from the log. of the
assumed mean.
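The steps just described can be sketched in Python using common logarithms in place of log tables. The data in `values` and `weights` are purely hypothetical (the figures of the worked example are not reproduced here), and the function names are illustrative. The last lines merely re-evaluate the antilog step quoted above.

```python
import math


def weighted_gm_direct(values, weights):
    """Direct method: antilog of sum(w * log x) / sum(w)."""
    total_weight = sum(weights)
    log_sum = sum(w * math.log10(x) for x, w in zip(values, weights))
    return 10 ** (log_sum / total_weight)


def weighted_gm_shortcut(values, weights, assumed):
    """Short-cut method: deviations d' of the logs from the log of an
    assumed geometric mean, weighted, averaged, and added back."""
    log_assumed = math.log10(assumed)
    deviations = [math.log10(x) - log_assumed for x in values]
    correction = sum(d * w for d, w in zip(deviations, weights)) / sum(weights)
    return 10 ** (log_assumed + correction)


# Hypothetical data, for illustration only.
values = [120.0, 150.0, 90.0, 200.0]
weights = [5, 8, 4, 3]

direct = weighted_gm_direct(values, weights)
shortcut = weighted_gm_shortcut(values, weights, assumed=150.0)
print(f"direct method    = {direct:.4f}")
print(f"short-cut method = {shortcut:.4f}")

# Antilog step from the worked example: sum of weighted logs 42.9129, weights 20.
worked = 10 ** (42.9129 / 20)
print(f"antilog(42.9129 / 20) to the nearest whole number = {round(worked)}")
```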
Items      Weight
1          5
.5         10
10.0       20
45.0       10
175.0      15
.01        2
4.0        15
11.2       8
Solution. Computation of the weighted harmonic mean
Items      Reciprocals    Weight    Weight × Reciprocals
1          1.0000         5         5.0000
.5         2.0000         10        20.0000
10.0       .1000          20        2.0000
45.0       .0222          10        .2220
175.0      .0057          15        .0855
.01        100.0000       2         200.0000
4.0        .2500          15        3.7500
11.2       .0893          8         .7144
Total                     85        231.7719
$$\text{Weighted harmonic mean} = \text{Reciprocal of } \frac{\sum (w \times \text{reciprocal})}{\sum w}$$
$$= \text{Reciprocal of } \frac{231.7719}{85}$$
$$= \text{Reciprocal of } 2.7267$$
$$= .3667$$
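The computation of the weighted harmonic mean can be checked with the sketch below, which works with unrounded reciprocals instead of four-figure table values, so the intermediate total differs slightly in the last decimal places. The function name is hypothetical.

```python
# Items and weights from the table above.
items = [1, 0.5, 10.0, 45.0, 175.0, 0.01, 4.0, 11.2]
weights = [5, 10, 20, 10, 15, 2, 15, 8]


def weighted_harmonic_mean(values, wts):
    """Reciprocal of the weighted arithmetic average of the reciprocals."""
    if any(v == 0 for v in values):
        raise ValueError("harmonic mean is undefined when an item is zero")
    weighted_reciprocals = sum(w / v for v, w in zip(values, wts))
    return sum(wts) / weighted_reciprocals


sum_weighted_reciprocals = sum(w / v for v, w in zip(items, weights))
whm = weighted_harmonic_mean(items, weights)

print(f"sum of weights                = {sum(weights)}")
print(f"sum of weight x reciprocal    = {sum_weighted_reciprocals:.4f}")
print(f"weighted harmonic mean        = {whm:.4f}")
```

The single very small item (.01) with its reciprocal of 100 dominates the total, which is why the harmonic mean lies so close to the smallest values of the series.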
Questions
1. What is meant by measures of central tendency? What are the characteristics
of a good measure of central tendency?
2. Define arithmetic average, geometric mean, median and mode. Which of
these is most representative and why? (M. Com. Agra, 1945).
3. What is a statistical average? What are the desirable properties for an
average to possess? Which of the averages you know possess most of these
properties? (M. A. Delhi, 19H).
4. What are the algebraic properties of the arithmetic average?
5. Define weighted average. How does it differ from a simple average? Is a
weighted average better than a simple one? Give reasons.
6. Discuss critically the use of weighted mean in statistics.
(B. Com. Calcutta, 1937).
7. What are the algebraic properties of the geometric mean? Is it a better
average than median and mode? If so, how?
8. Compare and contrast the relative merits and demerits of the various measures
of central tendency which you know.
29. Make a frequency table having grades of wages with class intervals of two
Annas each from the following data of daily wages received by 30 labourers in a certain
factory and then compute the average daily wages paid to a labourer.
Daily wages in annas,
14, 16, 16, 14, 22, 13, 15, 24, 12, 23,
14, 20, 17, 21, 18, 18, 19, 20, 17, 16,
15, 11, 12, 21, 20, 17, 18, 19, 22, 23.
(B. A. Hons. Punjab, 1945).
30. The following table gives the monthly average of automobile production
in the United States for the years 1926-1932 (unit 1,000 cars).
Year    Production    Year    Production
1926    358.4         1930    279.7
1927    283.4         1931    199.1
1928    363.2         1932    114.2
1929    446.5
Calculate the average per cent of change per year.
31. The following is the table of the ages of 30 adult persons:
Years 1 2 3 4 5 6 8
2029
°
2 1 2 2 1
7
1 1
9 Total
10
3039 2 1 2 1 2 8
4049 2 2 1 1 6
5059 1 2 1 4
6069 1. 1 2
Thus there are two persons of 23 years, one of 57 years and so on.
Find out the mean of the series
(a) by using only totals of class intervals.
(b) by using the entire data
32. A candidate obtains the following percentages in an examination: Sanskrit
75; Mathematics 84; Economics 56; English 78; Politics 57; History 55; Geography
47. It is agreed to give double weight to marks in English, Mathematics and
Sanskrit. What is the weighted and unweighted mean?
33. Explain what is meant by weighted average, and discuss the effect of weighting.
Calculate (i) the unweighted mean of the prices in column III and (ii) the mean
obtained by weighting each price by the quantity consumed.
I                   II                    III
Articles of food    Quantity consumed     Price in rupees per maund
Flour               11.5 mds.             5.8
Ghee                5.6 mds.              58.4
Sugar               .28 mds.              8.2
Potato              .16 mds.              2.5
Oil                 .35 mds.              20.0
(M. A. Cal., 1937).
34. The following table gives the number of employees and their monthly earnings
in two factories of a particular city:
                      Factory A                       Factory B
Description       No. of       Monthly            No. of       Monthly
of workmen        employees    earnings (Rs.)     employees    earnings (Rs.)
(a)               3            800                2            750
(b)               20           145                10           150
(c)               15           50                 15           60
(d)               25           30                 25           50
(e)               80           35                 40           40
(f)               250          20                 120          20
Compare the weighted averages.
35. Suppose that an automobile makes a 200 mile trip, covering the first 100
miles at the rate of 50 miles an hour and the second 100 miles at the rate of 40 miles
an hour. What is its average speed ?
36. A railway train runs for 30 minutes at a speed of 40 miles an hour and then,
because of repairs of the track, runs for 10 minutes at a speed of 8 miles an hour, after
which it resumes its previous speed and runs for 20 minutes except for a period of 2
minutes when it had to run over a bridge with a speed of 30 miles per hour. What
is its average speed?
37. The following table indicates the increase in cost of living over July 1946,
for a working class family as at 1st January 1955, and the weights assigned to various
groups.
38. The table shows the age distribution of married females according to sample
census of 1941 in the Baroda State.
Calculate the median age of married females and also the two quartiles.
(I. A. & A. S., etc., Exam., 1942)
39. Calculate the values of the median and the two quartiles for the following:
40. Calculate the mean and median for the following distribution.
41. The following table gives the distribution of the male and female population
of a certain area in India. By finding the mean age, the median age, and the upper
and lower quartile ages, make comments on the age distribution of the two sexes
in the area:
47. The following table gives the marks obtained by 65 students in Statistics in
a certain examination:
Examination marks    Number of students
More than 70%        7
More than 60%        18
More than 50%        40
More than 40%        40
More than 30%        63
More than 20%        65
Calculate the median of the above series.
48. Find out the median of the following series:
Wages (Rs.)    No. of labourers
60-70          5
50-60          10
40-50          20
30-40          5
20-30          3
49. The following is the age distribution of candidates appearing at the Matriculation
and Intermediate Arts examinations of the Patna University in 1937.
Age in years     12   13   14    15    16    17    18    19    20    21    22    Total
Matriculation    5    48   189   303   522   980   981   794   515   474   X     4,811
Intermediate     X    X    X     5     45    87    127   150   155   127   175   871
Compare the median and modal ages of the Matriculation candidates with those
of I. A. candidates. (M. A. Patna, 1940)
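50. The following table shows the frequency with which profits are made. What
is the Mode?
Profits                                                 Frequency
Exceeding Rs. 3,000 and not exceeding Rs. 4,000         83
Exceeding Rs. 4,000 and not exceeding Rs. 5,000         27
Exceeding Rs. 5,000 and not exceeding Rs. 6,000         25
Exceeding Rs. 6,000 and not exceeding Rs. 7,000         50
Exceeding Rs. 7,000 and not exceeding Rs. 8,000         75
Exceeding Rs. 8,000 and not exceeding Rs. 9,000         38
Exceeding Rs. 9,000 and not exceeding Rs. 10,000        18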
51. Find the modal wage group from the following table:
Wages in Rupees    No. of labourers
Above 30           520
Above 40           470
Above 50           399
Above 60           210
Above 70           105
Above 80           45
Above 90           7
52. Find out the median and the mode for the following table:
No. of days absent    No. of students
Less than 5           29
Less than 10          224
Less than 15          465
Less than 20          582
Less than 25          634
Less than 30          644
Less than 35          650
Less than 40          653
Less than 45          655
53. Find the median and mode from the following table:
Class    Frequency    Class    Frequency
0-3      4            18-20    24
3-6      8            20-24    14
6-10     10           24-25    16
10-12    14           25-28    11
12-15    16           28-30    10
15-18    20           30-36    6
54. Find the modal wage from the following data:
Weekly wage                 No. of wage-earners
Sh. d.       Sh. d.
12  6   to   17  6          4
17  6   to   22  6          44
22  6   to   27  6          38
27  6   to   32  6          28
32  6   to   37  6          6
37  6   to   42  6          8
42  6   to   47  6          12
47  6   to   52  6          2
52  6   to   57  6          2
(B. Com., Rajputana, 1949)
55. Find out the mode of the following series:
Size of item    Frequency    Size of item    Frequency
0-9.99          10           40-49.99        11
10-19.99        14           50-59.99        13
20-29.99        16           60-69.99        17
30-39.99        14           70-79.99        13
56. Calculate the geometric mean of the following figures : 
5, 10, 192, 14,374, 20,498, 1,20,674, 15,491
57. Compute the weighted geometric average of relative prices of the following
commodities for the year 1939 (Base year 1938, price = 100):
Commodity    Relative Price    Weight (value produced in 1938)
Corn         128.8             1,385
Cotton       62.4              819
Hay          117.7             842
Wheat        99.0              561
Oats         130.9             408
Potatoes     143.5             194
Sugar        125.6             142
Barley       150.2             100
Tobacco      101.1             103
Rye          116.2             25
Rice         117.5             17
Oil seeds    78.7              29
How does it differ from the unweighted geometric mean, and why?
(B. Com., Alld. 1943)
58. The following table gives index numbers for various items entering the cost
of living. Find an index of the cost of living by computing a weighted average of
these items. The weights to be used are also given in the table:
Table
Items                Index    Weight
1. Clothing          77.3     13
2. Food              74.5     43
3. Fuel and light    85.8     6
4. Housing           64.6     18
5. Sundries          92.5     20
59. Compute the geometric mean of the following series:
Marks    No. of students
0-10     5
10-20    7
20-30    15
30-40    25
40-50    8
Total    60
60. The annual incomes of fifteen families are given below in rupees:
80, 2500, 90, 1200, 1450, 7200, 120, 1060, 150, 480, 360, 96, 200, 520 and 60.
Calculate the Harmonic Mean.
61. The following table gives (a) the total number of persons possessing holdings
of different sizes and (b) the total area of land comprised in holdings of different
sizes in U. P. during the year ending on 30th June, 1945:
                                         Total number     Total area in
Size of holdings in acres                of persons in    thousands
                                         thousands        of acres
Not exceeding .5                         2,643            925
Exceeding .5 but not exceeding 1         1,696            1,556
Exceeding 1 but not exceeding 2          2,205            3,361
Exceeding 2 but not exceeding 3          1,430            3,373
Exceeding 3 but not exceeding 4          992              3,458
Exceeding 4 but not exceeding 5          703              3,150
Exceeding 5 but not exceeding 6          515              2,817
Exceeding 6 but not exceeding 7          378              2,446
Exceeding 7 but not exceeding 8          283              2,112
Exceeding 8 but not exceeding 9          216              1,830
Exceeding 9 but not exceeding 10         171              1,617
Exceeding 10 but not exceeding 12        206              2,264
Exceeding 12 but not exceeding 14        138              1,776
Exceeding 14 but not exceeding 16        96               1,424
Exceeding 16 but not exceeding 18        68               1,252
Exceeding 18 but not exceeding 20        51               972
Exceeding 20 but not exceeding 25        70               1,570
Over 25                                  115              5,310
Grand Total                              12,276           41,113
(i) Calculate the average size of holdings in the U. P.
(ii) Assuming the minimum size of an economic holding to be 10 acres
(1) Calculate the percentage of the area under uneconomic holdings in 1945 in
the U. P.
(2) Calculate the percentage of persons having uneconomic holdings in the
U. P. in 1945. (P. C. S. 1951)
62. (a) Define a 'weighted mean.'
If several sets of observations are combined into a single set, show that the mean
of the combined set is the weighted mean of the several sets.
(b) The number of asthma sufferers whose first attacks came at various ages is
given in the following table. Calculate the mean age at the first attack by any method.
TABLE
Age at first attack    Number of cases
0-5                    298
5-10                   113
10-15                  64
15-20                  61
20-25                  70
25-30                  81
30-35                  77
35-40                  64
40-45                  53
45-50                  40
50-55                  35
55-60                  24
60-65                  20
(I. A. S. 1955)
63. Find the mean, mode, standard deviation and coefficient of skewness for
the following:
Year under       10, 20, 30, 40, 50, 60.
No. of persons   15, 32, 51, 78, 97, 109.
(P. C. S. 1952)
64. What are the desiderata for a satisfactory average? Point out the special
characteristics of the arithmetic mean, the median and the geometric mean.
Explain the step-deviation method for finding out the arithmetic mean of a
frequency distribution. Derive the useful formula and apply it to find the arithmetic
mean of the distribution.
Variate     5, 10, 15, 20, 25, 30, 35, 40, 45, 50.
Frequency   20, 43, 75, 67, 72, 45, 39, 9, 8, 6.
(P. C. S. 1954)
65. The following table gives the monthly income of 24 families in a certain
locality:
Serial No. of    Monthly income    Serial No. of    Monthly income
the family       in Rupees         the family       in Rupees
1                60                13               96
2                400               14               98
3                86                15               104
4                95                16               75
5                100               17               80
6                150               18               94
7                110               19               100
8                74                20               75
9                90                21               600
10               92                22               82
11               280               23               200
12               180               24               84
Calculate the arithmetic average, the median and the mode of the above incomes.
Which average would represent the above series the best? Give reasons.
(P. C. S. 1955)
66. Figures concerning the number of deaths in two towns in a particular year are
given below:
                      Town A                        Town B
Age-group         No. of persons    Deaths      No. of persons    Deaths
in years          living                        living
0-10              500               100         12,000            4,800
10-20             3,000             150         6,000             360
20-30             7,000             200         9,000             180
30-40             10,000            300         25,000            250
over 40           19,500            750         48,000            576
Total             40,000            1,500       1,00,000          6,166
Compare the health conditions in both towns.
(P. C. S. 1955)
67. You are given the following statistics of population and unemployment in:
(a) Your country as a whole for a standardised age distribution.
(b) The local administrative area in which you live.
Calculate (i) the standardised unemployment rate in the country as a whole, (ii)
the standardised rate of unemployment in the local area and (iii) the crude rate of
unemployment in the local area.
                                        Age (Years)
                              16-30    30-45    45-60    60-75    Total
Standard population
Age constitution              250      350      300      100      1,000
Unemployment rate per cent    5        8        12       15
Local population
Age constitution              300      300      350      50       1,000
Unemployment rate per cent    4        9        12       20
(P. C. S. 1956).
68. Fifty items sold in Department A of the Corner Store had a mean price of 30
rupees. Seventy-five items sold in Department B had a mean price of 20 rupees. The
mean price of commodities sold in Departments A and B was 24 rupees. Is it right?
69. If $x_1$ and $x_2$ are two positive values of a variate, prove that their geometric
mean is equal to the geometric mean of their arithmetic and harmonic means.
70. (a) An examination candidate's percentages are: English, 73; French, 82;
Mathematics, 57; Science, 62; History, 60. Find the candidate's weighted mean if
weights of 4, 3, 3, 1, 1 respectively are allotted to the subjects.
(b) The average percentages for the same examination were 57, 52, 48, 55, 50
for the above subjects respectively. Find the weighted mean for the whole examination.
71. "The inherent inability of the human mind to grasp in its entirety a large body
of numerical data compels us to seek relatively few constants that will adequately
describe the data." R. A. Fisher.
Comment.
72. Find the average ages of men and women blood donors from the following
data:
Age, years          10-19    20-29     30-39     40-49    50-59    60-69
Frequency, Men      3016     6894      9229      5714     3575     1492
Frequency, Women    7845     16,008    13,107    9685     6374     2137
Age, years          70-79    80-89    90 and over
Frequency, Men      170      9        1
Frequency, Women    173      9
73. A candidate obtains the following percentages in an examination: Latin,
75; Mathematics, 84; French, 56; English, 78; Science, 57; History, 54; Geography,
47. It is agreed to give double weight to the marks in English, Mathematics
and Latin. What is his weighted mean?
74. The frequency distributions of real income in rupees of the employees of a
big industrial concern, in two different periods, are as given below:
                      Frequency
Income in Rs.    Period 1    Period 2
0-50             90          200
50-100           150         400
100-150          100         120
150-200          80          100
200-250          70          150
over 250         10          30
Total            500         1,000
The total income of 10 employees in the frequency class 'over 250' in Period 1 is
Rs. 3,000 and that of 30 employees in Period 2 is Rs. 18,000.
(a) Compute the mean and median incomes for the two periods.
(b) Write a very brief note on the relative economic conditions of the employees
in the two periods, supporting your statements by analysis of the given
data, if necessary.
(c) Every employee belonging to the top 25 per cent of the earners is required to
pay 1 per cent of his income to a worker's relief fund. Estimate the increase
in contributions to this fund from Period 1 to Period 2. (I. A. S. 1958)
75. The following are the monthly salaries in rupees of 30 employees of a firm:
139, 126, 114, 100, 88, 62, 77, 99, 103, 144, 148, 63, 69, 148, 132, 118, 142,
16, 123, 104, 95, 80, 85, 106, 123, 133, 140, 134, 108, 129.
The firm gave bonus of Rs. 10, 15, 20, 25, 30 and 35 for individuals in the
respective salary groups: exceeding 60 but not exceeding 75, exceeding 75 but not
exceeding 90 and so on up to exceeding 135 but not exceeding 150. Find out the average
bonus paid per employee. (B. Com., B. H. U.)
76. For a certain group of 'Saree' weavers of Banaras, the median and quartile
earnings per week are Rs. 44.3, Rs. 43.0 and Rs. 45.9 respectively. The earnings for
the group range between Rs. 40 and Rs. 50. Ten percent of the group earn under
Rs. 42 per week, 13 percent earn Rs. 47 and over and 6 percent Rs. 48 and over. Put
these data into the form of a frequency distribution and obtain an estimate of the mean
wage. (P. C. S., 19~6).
77. From a frequency distribution of marks in Accounts of 100 students, the mean
was found to be 35. Later it was discovered that the marks 35 were misread as 25.
Find the correct mean.
78. From the following data, find the missing frequency.
No. of Tablets          4-8  8-12  12-16  16-20  20-24  24-28  28-32  32-36  36-40
No. of Persons cured    11 13 16 14 9 17 6 4
The average number of tablets given to cure fever was 20.
79. Calculate the Median, Quartiles, 6th Decile and 70th Percentile from the
following data:
Marks less than    80 70 60 50 40 30 20 10
No. of Students    100 90 80 60 52 20 13
(B. Com., Raj., 1951).
80. (a) From the data given below, find the mode:
(b) If the mode and the median of a moderately asymmetrical series are 16
inches and 20.2 inches respectively, compute the most probable median.
(B. Com., Delhi, 1960).
81. Recast the following cumulative table into the form of an ordinary
frequency distribution and determine the value of Mode by using the formula
Mean − Mode = 3 (Mean − Median).
No. of days absent    No. of students    No. of days absent    No. of students
Less than 5           29                 Less than 30
Less than 10          224                Less than 35
Less than 15          465                Less than 40
Less than 20          582                Less than 45
Less than 25          634
(B. Com., Lucknow, 1957)
82. A taxicab drives from a plain-town to a hill-station, 60 miles distant, at a
mileage rate of 10 miles per gallon of petrol and on the return trip at 15 miles per gallon.
Find the harmonic mean rate of mileage per gallon. Verify that this is the proper
average in this particular case.
83. An aeroplane flies around a square the sides of which measure 100 miles
each. The aeroplane covers at a speed of 100 miles per hour the first side, at 200 miles
per hour the second side, at 300 miles per hour the third side and at 400 m.p.h. the
fourth side. What is the average speed of the aeroplane around the square?
84. A train moves first 10 miles at the rate of 10 m.p.h., next 20 miles at the rate
of 30 m.p.h., and then due to repairs in the track another 5 miles at the speed of 5
miles per hour. It covers the last 15 miles at the rate of 10 miles an hour. Find the
average speed of the train per hour.
85. The mean wage of 50 labourers working in a factory is Rs. 38. The mean
wage of 30 labourers working in the morning shift is Rs. 40. Find the mean wage
of remaining 20 labourers working in the evening shift.
86. The teachers of statistics reported mean examination marks of 37.5, 41 and
42 in their classes which consisted of 32, 25 and 17 students respectively. Determine
the mean marks for all the classes taken together.
87. The following table gives the distribution of the average weekly wages of
100 workers in a factory. Calculate (i) average weekly total wage bill of these
workers; (ii) the weekly wage of a worker whose wage is greater than that of
75% workers.
Weekly wages    16-20  21-25  26-30  31-35  36-40  41-45  46-50
No. of workers 7 12 It 8
Weekly wages 56 60
No. of workers
88. The monthly incomes of 8 families in rupees in a certain locality are given
below. Calculate the Mean, the Geometric Mean and Harmonic Mean, and confirm
that the relationship a > g > h holds true.
Family          A     B     C      D     E    F      G    H
Income (Rs.)    70    10    500    75    8    250    8    42
(Sagar, B. Com., II, 1965)
Calculate 3, 4 and 5 yearly moving averages from the following data:
Years 19P 152153 I 54 155 I 56 I 57 I 58 I 591 60 I 61 16 2 1 6 3 164 16,
Value 18 I 20 I 22 I 25 I 30 I 37 I 38 I 38 I 40 I 43 I 45 I 4 6 r 4 8 I 49 I F
There is a member A such that there are twice as many members older than A
as there are members younger than A. Estimate his age (in years up to two decimals).
(M. A., Eco., Delhi, 1963).
91. The arithmetic mean, the mode and the median of a group of 75 observations
were calculated to be 17, H, 19 respectively. It was later discovered that one
observation was wrongly read as 43 instead of the correct value 53. Examine to what
extent the calculated values of the three averages will be affected by the discovery
of this error. (M. A., Eco., Delhi, 1963)
92. If the mode and the median of a moderately asymmetrical series are 16.6 and
15.6 respectively, what would be its most probable median? (B. Com., Agra, 1960).
93. Under what conditions is weighted average (i) equal to simple average, (ii)
greater than simple average and (iii) less than simple average? Illustrate your answer
with the help of examples.
94. (a) A train starts from rest and travels successive quarters of a mile at average
speeds of 12, 16, 24 and 48 miles per hour. The average speed over the whole
mile is 19.2 m.p.h. and not 25 m.p.h.
(b) The price of a commodity increased by 5 percent from 1954 to 1955,
by 8 percent from 1955 to 1956 and by 77 percent from 1956 to 1957. The average
increase from 1954 to 1957 is quoted as 26 percent and not 30 percent.
Explain the two statements as you would to a layman and verify the arithmetic
mean. (M. Com. Agra, 1962)
95. If the arithmetic mean of two numbers is 20 and their geometric mean is 16, find
the harmonic mean.
Measures of Dispersion 10
Need and meaning. In the preceding chapters we have already
discussed why it is necessary to tabulate and classify statistical series
and to condense them into a single figure called average. The average,
as we have already seen, has its own limitations and even an ideal average
can represent a series only "as best as a single figure can". No doubt
averages have a very great utility in statistical analysis but they fail
to reveal the entire story of a phenomenon. There may be a dozen
series whose averages may be identical but which may differ from each
other in a hundred ways. Obviously in such cases further statistical
analysis of the data is necessary so that these differences between various
series may also be studied and accounted for. If this is done statistical
analysis would be more accurate and we shall be more confident of our
conclusions.
Suppose there are three series of nine items each as follows:
In the first series the mean is 40 and the value of all the items
is identical. The items are not at all scattered, and the mean fully
discloses the characteristics of this distribution. However, in the
second case though the mean is 40 yet all the items of the series have
different values. But the items are not very much scattered as the
minimum value of the series is 36 and the maximum is 44. In this
case also mean is a good representative of the series. Here mean
cannot replace each item yet the difference between the mean and
other items is not very significant. In the third series also, the mean
is 40 and the values of different items are also different, but here the
values are very widely scattered and the mean is 40 times the
smallest value of the series and half of the maximum value. Obviously
the average does not satisfactorily represent the individual items in
this group. In order to have a correct analysis of these three series,
it is essential that we study something more than their averages because
averages are identical and yet the series widely differ from each other in
their formation. The scatter in the first case is nil, in the second case
it varies within a small range, while in the third case the values range
over a very big span and they are widely scattered. It is evident from
the above that the extent of the scatter round an average should
also be studied to throw more light on the composition of a series. The
name given to this scatter is dispersion.
Dispersion in a general sense. Dispersion, thus, refers to the variability
in the size of items. It indicates that the size of items in a series is not
uniform. The values of various items differ from each other. If this
variation is substantial, dispersion is said to be considerable and if the
variation is little, dispersion is insignificant. This is rather a general sense
in which this term is used. If there is a series in which the scatter of the
values is much, say, from 100 to 1000, this series would be said to have
more dispersion than the one in which the values range only from 100
to 200.
Dispersion in a precise sense. The term dispersion not only gives a
general impression about the variability of a series, but also a precise
measure of this variation. Usually in a precise study of dispersion, the
deviations of size of items from a measure of central tendency are found
out and then these deviations are averaged, to give a single figure
representing the dispersion of the series. This figure can be compared
with similar figures representing other series. It goes without saying
that such comparisons would give a better idea about the formation of
series than a mere comparison of their averages.
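The idea of averaging the deviations from a measure of central tendency can be sketched as follows. The three series are assumptions made for illustration: the first two follow the description given earlier (nine items all equal to 40; nine different values running from 36 to 44, taken here as whole numbers), while the third is a hypothetical series chosen only so that its mean is 40, its smallest value 1 and its largest 80. The function names are likewise illustrative.

```python
def mean(series):
    return sum(series) / len(series)


def average_deviation(series):
    """Average of the absolute deviations of the items from their mean."""
    centre = mean(series)
    return sum(abs(x - centre) for x in series) / len(series)


# Illustrative series of nine items each; all three have a mean of 40.
series_1 = [40] * 9                              # no scatter at all
series_2 = [36, 37, 38, 39, 40, 41, 42, 43, 44]  # assumed whole numbers, 36 to 44
series_3 = [1, 9, 20, 30, 40, 50, 60, 70, 80]    # hypothetical, widely scattered

for name, series in [("first", series_1), ("second", series_2), ("third", series_3)]:
    print(
        f"{name:>6} series: mean = {mean(series):.1f}, "
        f"average deviation = {average_deviation(series):.2f}"
    )
```

The three averages are identical, but the single figure for the scatter separates the series at once.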
unit of the original data. In the above case the average income would be
referred to as Rs. 120 per month and the relative dispersion $\dfrac{20}{120}$ or .167
or 16.7%. In a comparison of the variability of two or more series, it is
the relative dispersion that has to be taken into account, as the absolute
dispersion may be erroneous or unfit for comparison if the series are
originally expressed in different units.
Measures of dispersion
The following measures of dispersion are in common use:
1. Range
2. Inter-Quartile Range
3. Semi-Inter-Quartile Range or Quartile Deviation
4. Average Deviation or Mean Deviation
5. Standard Deviation or Root-Mean-Square Deviation taken
from the mean.
We shall discuss them in turn.
RANGE
Range is the simplest possible measure of dispersion. It is the
difference between the values of the extreme items of a series. Thus if in a series
relating to the weight measurements of a group of students the lightest
student has a weight of 90 pounds and the heaviest of 240 pounds the
value of range would be 150 pounds. This figure indicates the variability
in the weights of students. The distance on the scale measuring 150
pounds would include the weight of every student. If the data are given
in the shape of a continuous frequency distribution, range is the difference
between the lower limit of the smallest class and the upper limit of the
biggest class.
Range as calculated above is an absolute measure of dispersion which
is unfit for purposes of comparison, if the distributions are in different
units. For example the range of the weights of students cannot be
compared with the range of their height measurements as the range of
weights would be in pounds and that of heights in inches. Sometimes,
for purposes of comparison, a relative measure of range is calculated.
If range is divided by the sum of the extreme items, the resulting figure
is called "The Ratio of the Range" or "The Coefficient of the Scatter."
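The two measures just described can be sketched in a few lines of Python. This is only an illustration: the function names are ours, and of the weights listed only the two extremes (90 and 240 pounds) come from the text, the middle values being made up.

```python
def value_range(values):
    """Absolute range: largest item minus smallest item."""
    return max(values) - min(values)


def coefficient_of_range(values):
    """Relative range: the range divided by the sum of the extreme items."""
    lowest, highest = min(values), max(values)
    return (highest - lowest) / (highest + lowest)


# Only 90 and 240 are taken from the text; the other weights are hypothetical.
weights = [90, 120, 155, 180, 240]

print(value_range(weights))                      # range in pounds
print(round(coefficient_of_range(weights), 3))   # a pure number, free of units
```

Because the coefficient is a ratio, it can be compared with the corresponding figure for heights in inches, which the absolute range cannot.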
Merits, demerits and uses of range
A good measure of dispersion should possess the same qualities
which were laid down in the last chapter for a good measure of central
tendency. A good measure of dispersion should be rigidly defined,
easily calculated, readily understood and further, should be capable of
algebraic treatment and should not be affected much by the fluctuations
of sampling.
The only merits possessed by range are that it can be easily calculated
and readily understood. As against these, there are many drawbacks from
which it suffers. The most important point against range is that it is
50% of the values, a percentile range which takes into account, say, the
90th and the 10th percentiles would give a better measure of dispersion
than either of these two. If the difference of the 90th and the 10th
percentiles is found out it will be called the 10-90 percentile range. Unlike
range it has the advantage of not being affected by the values of the
extreme items of a series and it also does not leave aside 50% of the
values as the interquartile range does. A 10-90 percentile range would
leave only 20% of the values at the extremes. It, however, suffers from
most of those defects from which range and interquartile range suffer.
SEMI-INTERQUARTILE RANGE
Semi-interquartile range or Quartile deviation $= \dfrac{Q_3 - Q_1}{2}$
where $Q_3$ and $Q_1$ stand for the upper and lower quartiles respectively.
In a symmetrical series the median lies half way on the scale from $Q_1$
to $Q_3$. If, therefore, the value of the quartile deviation is added to the
lower quartile or subtracted from the upper quartile, in a symmetrical
series, the resulting figure would be the value of the median. But
generally series are not symmetrical and in a moderately asymmetrical
series $Q_1$ + quartile deviation, or $Q_3$ - quartile deviation, would not give
the value of the median. There would be a difference between the two
figures and the greater the difference, the greater would be the extent of
departure from normality.
Quartile deviation is an absolute measure of dispersion. If it
is divided by the average value of the two quartiles, a relative measure
of dispersion is obtained. It is called the Coefficient of Quartile Deviation.
Symbolically
Coefficient of quartile deviation $= \dfrac{(Q_3 - Q_1)/2}{(Q_3 + Q_1)/2} = \dfrac{Q_3 - Q_1}{Q_3 + Q_1}$
The following example would clarify the procedure of the calculation
of the quartile deviation and its coefficient:
Example 1. Calculate the Semi-Inter-Quartile Range and its
coefficient of the marks of 59 students in Economics given below.
... 40 + 5040
1 2 (44'2538)=45'2 marks
Semi-interquartile range $= \dfrac{Q_3 - Q_1}{2} = \dfrac{44.2 - 22.5}{2} = 10.85$ marks.
be 33.35 and if they are subtracted from the upper quartile it will again be
33.35. The actual value of the median is 34.33. It shows that the series
is not perfectly normal though the departure from normality is not much.
It, however, reveals that the dispersion of items on the two sides of the
median is almost equal.
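The arithmetic of this example can be retraced with a short Python sketch. The function names are ours; the quartile values 22.5 and 44.2 are those used in the solution, and the quartiles themselves are taken as given rather than recomputed from the marks.

```python
def quartile_deviation(q1, q3):
    """Semi-interquartile range: half the distance between the quartiles."""
    return (q3 - q1) / 2


def coefficient_of_quartile_deviation(q1, q3):
    """Quartile deviation divided by the average of the two quartiles."""
    return (q3 - q1) / (q3 + q1)


q1, q3 = 22.5, 44.2          # lower and upper quartiles used in the solution
qd = quartile_deviation(q1, q3)

print(round(qd, 2))                                   # quartile deviation in marks
print(round(coefficient_of_quartile_deviation(q1, q3), 3))
# In a symmetrical series both of these would equal the median.
print(round(q1 + qd, 2), round(q3 - qd, 2))
```

Both figures on the last line come to 33.35, which is the value compared with the actual median of 34.33 above.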
Merits and drawbacks of quartile deviation
The quartile deviation possesses the merits of simple calculation and
easy understandability. It is commonly understood and its calculation
does not involve any mathematical intricacies. These are the points in
favour of quartile deviation but there are a large number of points which
go against it. Quartile deviation is neither based on all the observations
of the data, nor is it capable of further algebraic treatment. It is affected
to a considerable extent by the fluctuations of sampling. A change in
the value of a single item may in certain cases affect its value considerably.
Thus quartile deviation is not a very good measure of dispersion,
particularly for series in which the variation is considerable. However, for
rough studies, quartile deviation may give an approximate idea of the
extent of variability in a series.
In the direct method, as we have seen above, the mean deviation would
be calculated by totalling the deviations from the mean or median (plus
and minus signs ignored) and dividing this total by the number of items.
In the short-cut method the mean or median is found and the totals of
the values of items below the actual mean or median and above it are found
out. The former is subtracted from the latter and divided by the number
of items. The resulting figure is the required mean deviation.
Symbolically
$\delta_m = \dfrac{1}{n}(m_y - m_x)$
where $\delta_m$ stands for the mean deviation from median, $m_y$ for
the total of the values above the actual median, $m_x$ for the total of the values
below it, and $n$ for the number of items.
$\delta = \dfrac{1}{n}(a_y - a_x)$
where $\delta$ stands for the mean deviation from mean, $a_y$ stands for the
total of the values above the actual arithmetic average and $a_x$ for the total of the values
below it. The following example would illustrate these formulae:
Example 2. The following are the marks obtained by a batch of 9
students in a certain test:
Serial No.   Marks           Serial No.   Marks
             (out of 100)                 (out of 100)
1            68              5            54
2            49              6            38
3            32              7            59
4            21              8            66
                             9            41
Calculate the mean deviation of the series.
Solution. Direct method. Calculation of mean deviation of the series
of marks of 9 students (arranged in ascending order of magnitude).
Students     Marks     Deviations from median (49)
                       (+ and - signs ignored)
             (m)       (dm)
1            21        28
2            32        17
3            38        11
4            41         8
5            49         0
6            54         5
7            59        10
8            66        17
9            68        19
                       $\Sigma dm = 115$
Median = value of the $\dfrac{n+1}{2}$th item
= 49 marks
Mean deviation or $\delta_m = \dfrac{\Sigma dm}{n}$
where $\Sigma dm$ represents the summation of the deviations from the
median, and $n$ the number of items.
$\delta_m = \dfrac{115}{9}$ marks $= 12.8$ marks
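The direct and the short-cut calculations of the mean deviation from the median can be compared in a small Python sketch on the nine marks of Example 2. The function names are ours, and the short-cut is written on the assumption, true here, that as many items lie above the median as below it.

```python
from statistics import median


def mean_deviation_direct(values, centre):
    """Average of the deviations from `centre`, signs ignored."""
    return sum(abs(v - centre) for v in values) / len(values)


def mean_deviation_shortcut(values):
    """(total of values above the median - total of values below it) / n.

    Assumes equally many items lie on either side of the median.
    """
    med = median(values)
    above = sum(v for v in values if v > med)
    below = sum(v for v in values if v < med)
    return (above - below) / len(values)


marks = [68, 49, 32, 21, 54, 38, 59, 66, 41]   # the nine marks of Example 2
med = median(marks)

print(med)                                          # 49
print(round(mean_deviation_direct(marks, med), 1))  # 12.8
print(round(mean_deviation_shortcut(marks), 1))     # 12.8
```

The short-cut works because the median terms cancel when the two groups are of equal size, leaving only the two totals.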
Solution
Calculation of Mean Deviation from the arithmetic average.
Prices Rs.     Deviations from arithmetic average
               (Rs. 100.425)
               (+ and - signs ignored)
(m)            (d)
100.500        .075
100.250        .175
100.375        .050
100.625        .200
100.750        .325
100.125        .300
100.375        .050
100.625        .200
100.500        .075
100.125        .300
$\Sigma m = 1004.250$     $\Sigma d = 1.750$
Arithmetic average $= \dfrac{\Sigma m}{n} = \dfrac{1004.25}{10}$ = Rs. 100.425
Arithmetic average or $a = \dfrac{1004.250}{10} = 100.425$
Number of items smaller than arithmetic
average or $n_x = 5$ and their total
or $a_x = 501.250$
Number of items bigger than arithmetic
average or $n_y = 5$ and their total
or $a_y = 503.000$
Mean deviation $= \dfrac{1}{n}\left[(a_y - n_y \times a) + (n_x \times a - a_x)\right]$
$= \dfrac{1}{n}(a_y - a_x)$
$= \dfrac{1}{10}(503.000 - 501.250) = \dfrac{1.750}{10}$
= .175 rupees.
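The adjusted formula can be checked against the ten prices with a brief Python sketch. The function name is ours; the formula coded is the bracketed expression $(a_y - n_y a) + (n_x a - a_x)$, which remains valid even when the numbers of items above and below the average differ.

```python
def mean_deviation_from_mean(values):
    """Mean deviation about the arithmetic average, using group totals."""
    n = len(values)
    a = sum(values) / n                        # arithmetic average
    above = [v for v in values if v > a]
    below = [v for v in values if v < a]
    a_y, n_y = sum(above), len(above)          # total and count above the average
    a_x, n_x = sum(below), len(below)          # total and count below the average
    return ((a_y - n_y * a) + (n_x * a - a_x)) / n


prices = [100.500, 100.250, 100.375, 100.625, 100.750,
          100.125, 100.375, 100.625, 100.500, 100.125]

direct = sum(abs(p - sum(prices) / len(prices)) for p in prices) / len(prices)

print(round(mean_deviation_from_mean(prices), 3))   # 0.175
print(round(direct, 3))                             # 0.175
```

When $n_x = n_y$, as here, the two terms in $a$ cancel and the expression reduces to $(a_y - a_x)/n$.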
Mean deviation $= \dfrac{\Sigma fd}{\Sigma f} = \dfrac{472}{50}$ marks $= 9.44$ marks.
Adjustments
Number of items less than the actual arithmetic average (27) = 28
Direct Method.
. hmetlc
Atit . average=a + 'SfdX • 25+('864
~XI=l. 2f23 x 5 ) = 10.5
64
Age-group   Mid-value   Frequency   Deviation     Total deviations   Deviation from    Total
                                    from the      from the           the average       deviations
                                    assumed av.   assumed av.        (20.7), + and -   from the
                                    (19.5)                           signs ignored     average
(m)         (m.v.)      (f)         (dx)          (fdx)              (d)               (fd)
15-16       15.5         0          -4              0                5.2                0
16-17       16.5         1          -3             -3                4.2                4.2
17-18       17.5         3          -2             -6                3.2                9.6
18-19       18.5         8          -1             -8                2.2               17.6
19-20       19.5        12           0              0                1.2               14.4
20-21       20.5        14          +1            +14                 .2                2.8
21-22       21.5        14          +2            +28                 .8               11.2
22-23       22.5         5          +3            +15                1.8                9.0
23-24       23.5         2          +4             +8                2.8                5.6
24-25       24.5         3          +5            +15                3.8               11.4
25-26       25.5         1          +6             +6                4.8                4.8
26-27       26.5         0          +7              0                5.8                0
27-28       27.5         1          +8             +8                6.8                6.8
                        n = 64                    $\Sigma fdx = +77$                   $\Sigma fd = 97.4$
Arithmetic average or $a = x + \dfrac{\Sigma fdx}{n} = 19.5 + \dfrac{77}{64}$
= 20.7 years.
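The columns of the above table can be rebuilt with a short Python sketch. The variable names are ours, and the average is rounded to one decimal place before the deviations are taken, as is done in the table.

```python
mid_values = [15.5 + k for k in range(13)]            # 15.5, 16.5, ..., 27.5
freqs = [0, 1, 3, 8, 12, 14, 14, 5, 2, 3, 1, 0, 1]
assumed = 19.5                                        # assumed average

n = sum(freqs)
sum_fdx = sum(f * (m - assumed) for m, f in zip(mid_values, freqs))
average = assumed + sum_fdx / n                       # exact value 20.703...
average_1dp = round(average, 1)                       # 20.7, as used in the table

# Total of the deviations from the average, signs ignored, weighted by frequency.
sum_fd = sum(f * abs(m - average_1dp) for m, f in zip(mid_values, freqs))
mean_deviation = sum_fd / n

print(n, sum_fdx, average_1dp)      # 64 77.0 20.7
print(round(sum_fd, 1))             # 97.4
print(round(mean_deviation, 2))     # mean deviation in years
```

Dividing the total 97.4 by 64 gives a mean deviation of about 1.52 years.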
Symbolically
$\sigma = \sqrt{\dfrac{\Sigma d^2}{n}}$
where $\sigma$ stands for the standard deviation, $\Sigma d^2$ for the sum of
the squares of the deviations measured from the arithmetic average
and $n$ for the number of items.
Difference between root-mean-square deviation and standard deviation.
Various terms like Mean Error, Mean Square Error and Error of Mean
Square are used to denote the value of standard deviation. We shall
be using the term standard deviation only as it is most popularly used.
Some writers use the term root-mean-square deviation to denote the
standard deviation. This is technically wrong, because the standard deviation
is only one of the many values that the root-mean-square deviation
can take. Root-mean-square deviation is the square root of the arithmetic
average of the squares of deviations measured from any arbitrary value. If the
deviations are measured from the arithmetic average there is no difference
between root-mean-square deviation and the standard deviation;
in other words, standard deviation is the root-mean-square deviation
measured from the arithmetic average. If deviations are not measured
from the arithmetic average but from some other value we can find out
the value of the standard deviation from the value of the root-mean-square
deviation. In fact the short-cut method of calculating the standard
deviation is based on the relationship between standard deviation and
root-mean-square deviation. We shall discuss this point a little later.
Standard Deviation or $\sigma = \sqrt{\dfrac{\Sigma m^2 - (\Sigma m)^2/n}{n}}$
$= \sqrt{\dfrac{39{,}764 - (630)^2/10}{10}} = \sqrt{\dfrac{74}{10}}$
$= 2.72$
Short-cut method. Standard deviation can also be calculated by
a short-cut method. Here the deviations from an assumed average
are calculated and squared. Their sum is divided by the number of
items, or in other words, the arithmetic average of the squares of
deviations from the assumed average is found out. From this figure the
square of the arithmetic average of the deviations from the assumed
mean is subtracted. The square root of the resulting figure is the
standard deviation.
Symbolically
$\sigma = \sqrt{\dfrac{\Sigma dx^2}{n} - \left(\dfrac{\Sigma dx}{n}\right)^2}$
where $dx$ stands for the deviation from the assumed mean.
Example No. 9 would be solved by this method as follows:
Proof
Let $\sigma = \sqrt{\dfrac{\Sigma fd^2}{n}}$, $s = \sqrt{\dfrac{\Sigma fdx^2}{n}}$ and $c = (a - x)$
$\sigma^2 = \dfrac{\Sigma fd^2}{n}$ and $s^2 = \dfrac{\Sigma fdx^2}{n}$
$\sigma^2 = s^2 - c^2$
As $s^2$ would always be greater than $\sigma^2$, the root-mean-square
deviation from the mean would always be less than the root-mean-square
deviation from any other point.
Short-cut Method
Size of items     Deviations from       Square of
                  assumed mean 62       deviations
                  (dx)                  (dx²)
60                -2                     4
60                -2                     4
61                -1                     1
62                 0                     0
63                +1                     1
63                +1                     1
63                +1                     1
64                +2                     4
64                +2                     4
70                +8                    64
Total             +10                   84
$\sigma = \sqrt{\dfrac{\Sigma dx^2}{n} - \left(\dfrac{\Sigma dx}{n}\right)^2} = \sqrt{\dfrac{84}{10} - \left(\dfrac{10}{10}\right)^2} = \sqrt{8.4 - 1}$
$= \sqrt{7.4} = 2.72$
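That the direct and the short-cut methods agree, whatever assumed mean is chosen, can be seen from a short Python sketch on the ten items of this example. The function names are ours.

```python
from math import sqrt


def sd_direct(values):
    """sqrt((sum of m^2 - (sum of m)^2 / n) / n): the direct method."""
    n = len(values)
    return sqrt((sum(v * v for v in values) - sum(values) ** 2 / n) / n)


def sd_shortcut(values, assumed_mean):
    """Root-mean-square deviation from an assumed mean, corrected by
    subtracting the square of the average deviation from that mean."""
    n = len(values)
    dx = [v - assumed_mean for v in values]
    return sqrt(sum(d * d for d in dx) / n - (sum(dx) / n) ** 2)


sizes = [60, 60, 61, 62, 63, 63, 63, 64, 64, 70]

print(round(sd_direct(sizes), 2))          # 2.72
print(round(sd_shortcut(sizes, 62), 2))    # 2.72, assumed mean 62 as in the table
print(round(sd_shortcut(sizes, 65), 2))    # 2.72, any other assumed mean also works
```

The choice of assumed mean only affects the size of the intermediate figures, which is why a round number near the centre of the series is convenient.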
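This formula can be written in the following ways also:
(i) $\sigma = \sqrt{\dfrac{\Sigma dx^2 - n(a - x)^2}{n}}$
Then $dx = d + c$
$(dx)^2 = (d + c)^2 = d^2 + 2cd + c^2$
$\Sigma (dx)^2 = \Sigma d^2 + 2c\,\Sigma d + nc^2$
but $\Sigma d = 0$
$\therefore \Sigma (dx)^2 = \Sigma d^2 + nc^2$
$\dfrac{\Sigma (dx)^2}{n} = \dfrac{\Sigma d^2}{n} + c^2$
$\dfrac{\Sigma d^2}{n} = \dfrac{\Sigma (dx)^2}{n} - c^2$
$= \dfrac{\Sigma (dx)^2}{n} - (a - x)^2$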
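(ii) $\sigma = \sqrt{\dfrac{\Sigma dx^2}{n} - (a - x)^2}$
(i) $\sigma = \sqrt{\dfrac{84 - 10(63 - 62)^2}{10}} = \sqrt{\dfrac{84 - 10}{10}} = \sqrt{7.4} = 2.72$
(ii) $\sigma = \sqrt{\dfrac{84}{10} - (63 - 62)^2} = \sqrt{8.4 - 1} = \sqrt{7.4} = 2.72$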
o =J~~2
The following illustration would clarify this procedure:
Example 10. Calculate the standard deviation from the following
data:
Size of item   Frequency   Size of item   Frequency
6              3           10             8
7              6           11             5
8              9           12             4
9              13
Size of    Frequency   Size x       Deviations from    Deviations    Frequency x
items                  Frequency    the average (9)    squared       square of
                                                                     deviations
(m)        (f)         (mf)         (d)                (d²)          (fd²)
6           3           18          -3                 9             27
7           6           42          -2                 4             24
8           9           72          -1                 1              9
9          13          117           0                 0              0
10          8           80          +1                 1              8
11          5           55          +2                 4             20
12          4           48          +3                 9             36
           n = 48      $\Sigma mf = 432$                             $\Sigma fd^2 = 124$
Arithmetic average $= \dfrac{\Sigma mf}{n} = \dfrac{432}{48} = 9$
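The working of Example 10 can be followed through in Python as a hedged sketch. The function name is ours, and the final figure printed is simply what the formula $\sqrt{\Sigma fd^2 / n}$ yields from the totals in the table.

```python
from math import sqrt


def grouped_mean_and_sd(sizes, freqs):
    """Mean and standard deviation of a discrete frequency distribution,
    with deviations measured from the actual arithmetic average."""
    n = sum(freqs)
    mean = sum(m * f for m, f in zip(sizes, freqs)) / n
    sum_fd2 = sum(f * (m - mean) ** 2 for m, f in zip(sizes, freqs))
    return n, mean, sum_fd2, sqrt(sum_fd2 / n)


sizes = [6, 7, 8, 9, 10, 11, 12]
freqs = [3, 6, 9, 13, 8, 5, 4]

n, mean, sum_fd2, sd = grouped_mean_and_sd(sizes, freqs)
print(n, mean, sum_fd2)   # 48 9.0 124.0
print(round(sd, 2))       # square root of 124/48
```

Because the average here is a whole number, the deviations are whole numbers too and the direct method involves no awkward decimals.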
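Standard Deviation
$\sigma = \sqrt{\dfrac{\Sigma fdx^2}{n} - (a - x)^2}$
Arithmetic average or $a = x + \dfrac{\Sigma fdx}{n} = 22 + \dfrac{38}{100} = 22.38$ articles
Standard deviation
$= \sqrt{\dfrac{\Sigma fdx^2}{n} - (a - x)^2} = \sqrt{\dfrac{490}{100} - (.38)^2} = \sqrt{4.9 - .144} = \sqrt{4.756}$
= 2.2 articles approx.
or
$\sigma = \sqrt{\dfrac{\Sigma fdx^2}{n} - \left(\dfrac{\Sigma fdx}{n}\right)^2}$
$= \sqrt{\dfrac{490}{100} - \left(\dfrac{38}{100}\right)^2} = \sqrt{4.9 - .144} = \sqrt{4.756}$
= 2.2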
Age-group   Mid-value   Frequency   Deviations from   Total         Square of     Frequency x
                                    the assumed       deviations    deviations    square of
                                    av. (55)                                      deviations
(m)         (m.v.)      (f)         (dx)              (fdx)         (dx²)         (fdx²)
20-30       25            3         -30                 -90         900            2700
30-40       35           61         -20               -1220         400           24400
40-50       45          132         -10               -1320         100           13200
50-60       55          153           0                   0           0               0
60-70       65          140         +10               +1400         100           14000
70-80       75           51         +20               +1020         400           20400
80-90       85            2         +30                 +60         900            1800
                        n = 542                       $\Sigma fdx = -150$         $\Sigma fdx^2 = 76500$
$(a - x) = \dfrac{-150}{542} = -.28$
$\sigma = \sqrt{\dfrac{76500 - 542(.28)^2}{542}}$
$= \sqrt{141.07}$
$= 11.9$
The following method will also give us the same result.
Standard deviation of the age distribution
$= \sqrt{\dfrac{\Sigma fdx^2}{n} - \left(\dfrac{\Sigma fdx}{n}\right)^2}$
$= \sqrt{\dfrac{76500}{542} - \left(\dfrac{-150}{542}\right)^2} = \sqrt{141.07}$
= 11.9 years
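The totals of the table and the resulting standard deviation can be reproduced by the following Python sketch. The function name is ours; the mid-values, frequencies and the assumed average of 55 are those of the table above.

```python
from math import sqrt


def sd_from_assumed_mean(mid_values, freqs, assumed):
    """Standard deviation of a grouped series by the short-cut method:
    sqrt(sum(f*dx^2)/n - (sum(f*dx)/n)^2), dx measured from `assumed`."""
    n = sum(freqs)
    sum_fdx = sum(f * (m - assumed) for m, f in zip(mid_values, freqs))
    sum_fdx2 = sum(f * (m - assumed) ** 2 for m, f in zip(mid_values, freqs))
    sd = sqrt(sum_fdx2 / n - (sum_fdx / n) ** 2)
    return n, sum_fdx, sum_fdx2, sd


mid_values = [25, 35, 45, 55, 65, 75, 85]
freqs = [3, 61, 132, 153, 140, 51, 2]

n, sum_fdx, sum_fdx2, sd = sd_from_assumed_mean(mid_values, freqs, 55)
print(n, sum_fdx, sum_fdx2)   # 542 -150 76500
print(round(sd, 1))           # 11.9
```

The negative sign of the total deviation drops out on squaring, so it affects the mean but not the standard deviation.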
Example 13. The following data relate to the ages of a group
of Government employees. Calculate the standard deviation.
Age      Number of employees   Age      Number of employees
50-55    25                    30-35    80
45-50    30                    25-30    110
40-45    40                    20-25    170
35-40    45
Solution. Calculation of standard deviation
Standard deviation or
$\sigma = \sqrt{\dfrac{\Sigma fdx^2}{n} - \left(\dfrac{\Sigma fdx}{n}\right)^2} \times i$
$= \sqrt{\dfrac{2435}{500} - \left(\dfrac{635}{500}\right)^2} \times 5$
= 9.0 years.
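The step-deviation working behind these figures can be sketched in Python. The names are ours, and the assumed mean of 37.5 (the mid-value of the group 35-40) is our assumption: it is the choice that reproduces the totals 635 and 2435 used above.

```python
from math import sqrt

# Classes of Example 13 with their mid-values and frequencies.
mid_values = [52.5, 47.5, 42.5, 37.5, 32.5, 27.5, 22.5]
freqs = [25, 30, 40, 45, 80, 110, 170]
assumed = 37.5      # assumed mean (our assumption, see above)
interval = 5        # magnitude of the class interval, i

n = sum(freqs)
# Step deviations: deviations from the assumed mean in units of the class interval.
steps = [(m - assumed) / interval for m in mid_values]
sum_fdx = sum(f * d for f, d in zip(freqs, steps))
sum_fdx2 = sum(f * d * d for f, d in zip(freqs, steps))

sd = sqrt(sum_fdx2 / n - (sum_fdx / n) ** 2) * interval

print(n, sum_fdx, sum_fdx2)   # 500 -635.0 2435.0
print(round(sd, 1))           # 9.0
```

Working in units of the class interval keeps the deviations small; the multiplication by the interval at the end restores the original unit of years.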
Where $\sigma_c^2$ stands for the square of the standard deviation after
correction, $\sigma^2$ for the square of the standard deviation before correction
and $h^2$ for the square of the magnitude of the class intervals.
$\sigma^2 = 141.07$
$h = 10$
Therefore $\sigma_c^2 = 141.07 - \dfrac{10^2}{12} = 132.74$
Therefore $\sigma_c = \sqrt{132.74} = 11.5$
$a_{12} = \dfrac{n_1 a_1 + n_2 a_2}{n_1 + n_2}$
Where $a_{12}$ stands for the mean of a series and $a_1$ and $a_2$ for the means
of its component parts, and $n_1$ and $n_2$ for the number of items in the
two component parts respectively.
If, further, $\sigma_1$ and $\sigma_2$ stand for the standard deviations of these
component parts and $\sigma_{12}$ for the standard deviation of the whole series
then
$\sigma_{12} = \sqrt{\dfrac{n_1(\sigma_1^2 + d_1^2) + n_2(\sigma_2^2 + d_2^2)}{n_1 + n_2}}$
where $d_1 = (a_1 - a_{12})$ and $d_2 = (a_2 - a_{12})$.
                      Series A   Series B
Number of items       100        500
Mean                  50         60
Standard deviation    10         11
Solution.
Combined mean or
$a_{12} = \dfrac{n_1 a_1 + n_2 a_2}{n_1 + n_2}$
$= \dfrac{(100 \times 50) + (500 \times 60)}{100 + 500} = \dfrac{35{,}000}{600}$
$= 58.3$
Combined standard deviation or
$\sigma_{12} = \sqrt{\dfrac{100[(10)^2 + (8.3)^2] + 500[(11)^2 + (1.7)^2]}{100 + 500}}$
$= \sqrt{\dfrac{16889 + 61945}{600}}$
$= \sqrt{131.39} = 11.5$
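The combination of two series can be sketched in Python as follows. The function name is ours. The sketch carries the combined mean at full precision rather than rounding it to 58.3, so its intermediate figures differ slightly from the hand calculation, though the result to one decimal place is the same.

```python
from math import sqrt


def combine(n1, a1, s1, n2, a2, s2):
    """Combined mean and standard deviation of two component series,
    given the size, mean and standard deviation of each."""
    a12 = (n1 * a1 + n2 * a2) / (n1 + n2)
    d1, d2 = a1 - a12, a2 - a12          # distances of component means from a12
    var12 = (n1 * (s1 ** 2 + d1 ** 2) + n2 * (s2 ** 2 + d2 ** 2)) / (n1 + n2)
    return a12, sqrt(var12)


mean_ab, sd_ab = combine(100, 50, 10, 500, 60, 11)
print(round(mean_ab, 1), round(sd_ab, 1))      # 58.3 11.5

# Equal sizes and equal means: the formula reduces to sqrt((s1^2 + s2^2) / 2).
mean_eq, sd_eq = combine(100, 50, 10, 100, 50, 11)
print(round(sd_eq, 2))                         # 10.51
```

The terms $d_1^2$ and $d_2^2$ account for the spread between the component means, which is why they vanish when the two means coincide.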
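$\sigma_{12} = \sqrt{\dfrac{\sigma_1^2 + \sigma_2^2}{2}}$
Thus, if in the above example, the number of items in each case
was 100 and if the mean in each case was 50 the combined standard
deviation by the first method would have been
$\sqrt{\dfrac{100(100 + 0) + 100(121 + 0)}{100 + 100}}$
$= \sqrt{\dfrac{10000 + 12100}{200}}$
$= \sqrt{\dfrac{22100}{200}} = \sqrt{110.5}$
$= 10.51$
If we apply the second rule then
$\sigma_{12} = \sqrt{\dfrac{\sigma_1^2 + \sigma_2^2}{2}} = \sqrt{\dfrac{100 + 121}{2}} = \sqrt{\dfrac{221}{2}}$
$= 10.51$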
We can know from elementary algebra that the sum of the first
$n$ natural numbers is
$\dfrac{n(n+1)}{2}$
and that the sum of the squares of the first $n$ natural numbers is
$\dfrac{n(n+1)(2n+1)}{6}$
Thus the sum of the squares of natural numbers 1 to 5 would
be 1+4+9+16+25 = 55. It is equal to
$\dfrac{5(5+1)(10+1)}{6} = \dfrac{5 \times 6 \times 11}{6} = 55$
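Both identities are easily verified by brute force, as in this small Python sketch (the function names are ours):

```python
def sum_first_n(n):
    """n(n+1)/2: the sum of the first n natural numbers."""
    return n * (n + 1) // 2


def sum_squares_first_n(n):
    """n(n+1)(2n+1)/6: the sum of the squares of the first n natural numbers."""
    return n * (n + 1) * (2 * n + 1) // 6


print(sum_first_n(5), sum_squares_first_n(5))   # 15 55

# Compare the closed forms with direct addition for a range of n.
for n in range(1, 50):
    assert sum_first_n(n) == sum(range(1, n + 1))
    assert sum_squares_first_n(n) == sum(k * k for k in range(1, n + 1))
```

Integer division is safe here because both products are always exactly divisible by 2 and by 6 respectively.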
We have seen in Example No. 9 (Direct method No. 2) that
$\sigma = \sqrt{\dfrac{\Sigma m^2 - (\Sigma m)^2/n}{n}}$
~o11
4550 47.5 110
5055 52.5 1111
......... ... ............ .." ......... ............

200 200
The value of the mean when 43 was misread as 53 is given by
$40 = \dfrac{1}{200}(2.5f_1 + 7.5f_2 + \ldots + 37.5f_8 + 42.5f_9 + 47.5f_{10} + 52.5f_{11} + \ldots)$
Let the value of the corrected mean be $\bar{x}$.
Then $\bar{x} = \dfrac{1}{200}(2.5f_1 + 7.5f_2 + \ldots + 37.5f_8 + 42.5(f_9 + 1) + 47.5f_{10} + 52.5(f_{11} - 1) + \ldots)$
Let $2.5f_1 + \ldots + 37.5f_8 + 47.5f_{10} + 57.5f_{12} + \ldots = s$
Then $40 = \dfrac{1}{200}(s + 42.5f_9 + 52.5f_{11})$ or $s + 42.5f_9 + 52.5f_{11} = 8000$
and $\bar{x} = \dfrac{1}{200}[s + 42.5(f_9 + 1) + 52.5(f_{11} - 1)]$
$= \dfrac{1}{200}[s + 42.5f_9 + 52.5f_{11} + 42.5 - 52.5]$
$= \dfrac{1}{200}(8000 - 10) = \dfrac{7990}{200} = 39.95$
$\dfrac{1}{200}[45000 - 150] = \dfrac{44850}{200} = 224.25$
$\sigma^2 = s^2 - d^2$ where $d$ is the difference between the actual and
assumed mean.
In this example $s^2 = 224.25$ and $d = (40 - 39.95) = 0.05$
$\therefore \sigma^2 = 224.25 - 0.0025$
$= 224.2475$
$\therefore \sigma = 14.97$
The corrected standard deviation corresponding to the corrected
distribution is 14.97.
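The same correction can be carried out from the totals of the series, as in this Python sketch. The function name is ours, and the original standard deviation of 15 is an inference from the figure 45000/200 = 225 used above, not a value stated in the surviving text.

```python
from math import sqrt


def correct_for_misread(n, mean, sd, wrong, right):
    """Corrected mean and standard deviation after one item recorded as
    `wrong` is replaced by its true value `right`."""
    sum_x = n * mean - wrong + right
    sum_x2 = n * (sd ** 2 + mean ** 2) - wrong ** 2 + right ** 2
    new_mean = sum_x / n
    new_sd = sqrt(sum_x2 / n - new_mean ** 2)
    return new_mean, new_sd


# 200 items, mean 40; the age 43 (mid-value 42.5) was entered as 53 (mid-value 52.5).
# The standard deviation of 15 before correction is our inference (see above).
new_mean, new_sd = correct_for_misread(200, 40, 15, wrong=52.5, right=42.5)

print(round(new_mean, 2))   # 39.95
print(round(new_sd, 2))     # 14.97
```

Since the data are grouped, it is the mid-values of the two classes concerned, not the ages themselves, that enter the correction.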
Example 17. The mean age and standard deviation of a group
of 100 persons (grouped in intervals 10, 12, ... etc.) were found
to be 32.02 and 13.18. Later it was discovered that the age 57 was
misread as 27. Find the corrected mean and standard deviation.
Solution. The age 57 belongs to the group 56-58 (mid-value
57) and the age 27 belongs to the group 26-28 (mid-value 27).
Let the misread frequencies of these two groups be $f_1$ and $f_2$. Then
the corrected frequencies will be $(f_1 + 1)$ and $(f_2 - 1)$ respectively. All
other frequencies have been entered correctly.
Mid-value   Frequency (wrong)   Frequency (correct)
57          $f_1$               $f_1 + 1$
27          $f_2$               $f_2 - 1$
$29.93 = \dfrac{1}{1000}[T + 16075]$  $\therefore T = 29930 - 16075 = 13855$
$\bar{x} = \dfrac{1}{1000}[T + (28 \times 10) + (121 \times 20) + (198 \times 30) + (176 \times 35) + (27 \times 50)]$
$= \dfrac{30005}{1000} = 30.005$
Again let $\Sigma f_i (x_i - 29.93)^2 = T'$, where $f_i$ are the
correct frequencies. Then the second moment about 29.93, when the errata were not
considered, is given by:
$s^2 = \dfrac{1}{1000}[T' + (28 \times 397.2049) + (121 \times 98.6049) + (198 \times .0049) + (176 \times 25.7049) + (27 \times 402.8049)]$
$C = \sqrt{\dfrac{2\Sigma d^2}{n}}$
Modulus is equal to standard deviation multiplied by the square
root of 2 or
$C = \sigma \times \sqrt{2}$
Like standard deviation this measure is also based on the second
moment about the mean.
Precision
It is the reciprocal of modulus.
Thus
Precision $= \dfrac{1}{\sigma\sqrt{2}}$
Probable error
It is equal to .67449 × standard deviation.
Modulus, precision and probable errors are used in the theory of
errors of observations. We shall discuss them in the chapters on Sampling.
Standard deviation should not be confused with the term "Standard
Error" which stands for the standard deviation of simple sampling.
The concept of standard error will also be discussed in detail in the
chapters on Sampling.
Variance
It is equal to the square of the standard deviation or in other words
it is the second moment about the mean.
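Since modulus, precision, probable error and variance are all simple functions of the standard deviation, they can be gathered into one small Python sketch. The function name is ours and the standard deviation of 10 is a purely illustrative figure.

```python
from math import sqrt


def derived_measures(sd):
    """Measures obtained directly from the standard deviation."""
    modulus = sd * sqrt(2)
    return {
        "modulus": modulus,                 # standard deviation x square root of 2
        "precision": 1 / modulus,           # reciprocal of the modulus
        "probable_error": 0.67449 * sd,     # .67449 x standard deviation
        "variance": sd ** 2,                # square of the standard deviation
    }


measures = derived_measures(10)             # sd = 10 is an illustrative value
for name, value in measures.items():
    print(name, round(value, 4))
```

None of these adds information beyond the standard deviation itself; each merely rescales it for a particular use.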
Coefficient of variation
It stands for the percentage which the value of standard deviation
is to the value of the mean. In other words, if standard deviation
is divided by the mean and multiplied by 100 we get the coefficient
of variation. This measure was first suggested by Professor Karl
Pearson. According to him, coefficient of variation is the "percentage
variation in the mean, the standard deviation being treated as the total
variation in the mean."
Symbolically
Coefficient of variation or $V = \dfrac{\sigma}{a} \times 100$
= Coefficient of standard deviation × 100
Thus, if the mean of a series is 50 and the standard deviation is 10,
the coefficient of variation would be
$\dfrac{10}{50} \times 100$
or 20%
It means that the standard deviation is 20% of the mean.
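A Python sketch of the coefficient of variation follows. The function name is ours; the first call uses the figures of the illustration above, while the second series (mean 200, standard deviation 20) is hypothetical and is added only to show a comparison.

```python
def coefficient_of_variation(sd, mean):
    """Standard deviation expressed as a percentage of the mean."""
    return sd * 100 / mean


v1 = coefficient_of_variation(10, 50)     # figures from the illustration above
v2 = coefficient_of_variation(20, 200)    # a hypothetical second series

print(v1)   # 20.0
print(v2)   # 10.0
```

Although the second series has the larger standard deviation, it is relatively the less variable of the two, which is the kind of comparison this coefficient is designed for.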
Gini's mean difference
Corrado Gini, an Italian statistician, has suggested that instead
of measuring dispersion from any measure of central tendency, the
mean difference between the values of all possible pairs of the variable
should be found out, and it would give a good measure of dispersion.
Thus, this measure of dispersion is equal to the mean difference (regardless
of algebraic signs) of each possible pair of the values of the variable.
Symbolically
Gini's mean difference $= \dfrac{g}{m}$
where $g$ stands for the total of the differences in the values of all
possible pairs of a variable and $m$ stands for the total number of
differences. The total number of differences would be equal to $\tfrac{1}{2}n(n-1)$.
The following example would illustrate the above formulae:
Example 19. Find out Gini's mean difference from the following
items:
22, 24, 26, 28, 30.
Solution
30-22 = 8     28-22 = 6
30-24 = 6     28-24 = 4
30-26 = 4     28-26 = 2
30-28 = 2
Total   20            12
The mean deviation of the above series is 2.4 and the standard
deviation 2.8.
Gini's mean difference is always more than the mean deviation
as it gives greater importance to extreme variations. The value of
Gini's mean difference lies in the fact that it studies the variations
among the values of a variable rather than from a central value.
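Gini's mean difference for the five items of Example 19 can be worked out by enumerating every pair, as in the following Python sketch. The function name is ours; the mean deviation and standard deviation are computed alongside for comparison.

```python
from itertools import combinations
from math import sqrt


def gini_mean_difference(values):
    """Mean of the absolute differences over all n(n-1)/2 pairs of values."""
    diffs = [abs(a - b) for a, b in combinations(values, 2)]
    return sum(diffs) / len(diffs)


items = [22, 24, 26, 28, 30]
n = len(items)
mean = sum(items) / n

pairs = list(combinations(items, 2))
total_difference = sum(abs(a - b) for a, b in pairs)
mean_deviation = sum(abs(v - mean) for v in items) / n
standard_deviation = sqrt(sum((v - mean) ** 2 for v in items) / n)

print(len(pairs), total_difference)     # 10 40
print(gini_mean_difference(items))      # 4.0
print(mean_deviation)                   # 2.4
print(round(standard_deviation, 1))     # 2.8
```

The mean difference of 4 exceeds both the mean deviation and the standard deviation of the same series, in keeping with the remark above.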
If the square root of the average of the squares of all dif¥erences is
Thus jZ::j~ x j 2( 55 1)
~/~xJ}
J 40 X 2..00:J20
5 2
d
tion.
_ :±: 1 stan :±: 2 stan ::I: 3 stan
dud dev;,, dard devia dard devia
tion tion tion
The range in both the cases is Rs. 2,000 and the mean deviation
is Rs. 666.7 in both the cases. The absolute measures of dispersion
are thus equal but the variation in the two series is, in reality, not
identical. If, however, we calculate relative measures of dispersion this
anomaly would be removed. The coefficient of range in the two cases
would be 1/21 and 1/3 respectively and similarly the mean coefficient of
dispersion would be 2/63 and 2/9 respectively. In comparing dispersion
of two series, expressed in different units, the use of relative
measures of dispersion is inevitable because absolute measures of
dispersion in such cases would be in different units.
Lorenz curve
Dispersion can be studied graphically also with the help of what
is called the Lorenz Curve, after the name of Dr. Lorenz who first studied
the dispersion of distribution of wealth by the graphic method. The
technique of drawing a Lorenz Curve is not very difficult. In it the size
of items and the frequencies are both cumulated and taking the total as
100, percentages are calculated for the various cumulated values. These
percentages are plotted on a graph paper. If there is proportionately
equal distribution of the frequencies over various values of a variate, the
points would lie in a straight line. This line is called the "Line of Equal
Distribution." If, however, the distribution of items is not proportionately
equal, it indicates variability, and the curve would be away from
the line of equal distribution. The farther the curve is from this
line, the greater is the variability in the series. The following example
would illustrate the procedure of drawing a Lorenz curve:
Example 16. Draw a Lorenz curve from the following data:
Income     Number of persons
           Group A     Group B     Group C
10          5           8          15
20         10           7           6
40         20           5           2
50         25           3           1
80         40           2           1
 
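To draw the Lorenz Curve from the above data the size of the item
and frequencies would have to be cumulated and then percentages would
have to be calculated by taking the respective totals as 100. This is
done in the following table:
Income                    Group A                   Group B                   Group C
Size  Cumu-   Cumu-       No. of  Cumu-   Cumu-     No. of  Cumu-   Cumu-     No. of  Cumu-   Cumu-
      lated   lative      persons lated   lative    persons lated   lative    persons lated   lative
      income  per cent            number  per cent          number  per cent          number  per cent
10     10       5          5       5       5         8       8      32        15      15      60
20     30      15         10      15      15         7      15      60         6      21      84
40     70      35         20      35      35         5      20      80         2      23      92
50    120      60         25      60      60         3      23      92         1      24      96
80    200     100         40     100     100         2      25     100         1      25     100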
Now the cumulative percentages would be plotted on a graph paper.
Percentages relating to the number of persons would be shown on the
abscissa and from left to right the scale would begin with 100 and end
with 0. The income percentages would be shown on the ordinate and
here the scale will begin with 0 at the bottom and go up to 100 at the top.
The above percentages would give the following type of curve:
From the above figure it is clear that in the first group of persons,
the distribution of income is proportionately equal so that 5% of the
income is shared by 5% of the population, 15% of the income by 15%
of the population and so on. It gives the line of equal distribution.
In the second group the distribution is uneven so that 5% of the income
is shared by 32% of the people and 15% of the income by 60% of the
people. In the third group the distribution is still more unequal so that
5% of the income is shared by 60% of the people and 15% of the income
by 84% of the people. The variation in group C is thus greater than the
variation in group B. Curve C is thus at a greater distance from the
line of equal distribution than curve B.
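The cumulative percentages on which the three curves are based can be computed with a short Python sketch. The function and variable names are ours; the figures are those of the example, and plotting is left out so that the sketch needs no graphing library.

```python
from itertools import accumulate


def cumulative_percentages(values):
    """Cumulate the values and express each running total as a percentage
    of the grand total."""
    total = sum(values)
    return [100 * c / total for c in accumulate(values)]


incomes = [10, 20, 40, 50, 80]
persons = {
    "A": [5, 10, 20, 25, 40],
    "B": [8, 7, 5, 3, 2],
    "C": [15, 6, 2, 1, 1],
}

income_pct = cumulative_percentages(incomes)
for group, counts in persons.items():
    person_pct = cumulative_percentages(counts)
    # Each pair is one point of the curve: (% of persons, % of income).
    print(group, list(zip(person_pct, income_pct)))
```

For group A every point has equal coordinates, which gives the line of equal distribution; for groups B and C the percentage of persons runs ahead of the percentage of income, and more so for C.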
The Lorenz curve has a great drawback. It does not give any
numerical value of the measure of dispersion. It merely gives a picture
of the extent to which a series is pulled away from an equal distribution.
It should be used along with some numerical measure of dispersion. It
is very useful in the study of income distributions, distributions of land
and wages, etc.
Questions
1. What is meant by dispersion? What are the methods of computing measures
of dispersion? Illustrate the practical utility of such methods.
(M. Com., Ail••, 194').
2. Explain the meaning of the term dispersion and distinguish between absolute
and relative measures of dispersion. (B. Com., Allahabad, 1946).
3. Discuss the various ways in which the differences in the characteristics of
frequency distributions are generally measured. (B. Com., Lucknow, 1957).
4. Explain the various methods of describing the scatter of a frequency
distribution and say what you know as to the relative worth of the relative measures.
(B.U..,NfII1lW. 1944).
5. Frequency distributions may either differ in the numerical size of their
averages though not necessarily in their formations or they may have the same values
of the average but differ in their respective formations.
Explain and illustrate how the measures of dispersion afford a supplement to the
information about the frequency distributions given by the averages.
(M. Com., Rajputana, 1952).
6. Define carefully the mean deviation, standard deviation and quartile
deviation of any given distribution. In what problems should each be used?
(M. A., Allahabad, 1940).
7. What are the mathematical properties of standard deviation? How is it a
better measure of dispersion than the mean deviation or quartile deviation?
8. What is meant by Sheppard's Corrections? Under what conditions should
these corrections be made?
9. Define dispersion. Why is it necessary to measure dispersion in order to
make comparisons of frequency distributions?
10. What is range? What are its advantages and disadvantages as a measure of
dispersion?
11. Find directly the standard deviation of the natural numbers from 1 to 10
and verify the answer obtained by a short-cut method.
12. Write short notes on
(a) Lorenz Curve (b) Charlier's Check (c) Gini's Mean Difference (d) Precision
(e) Modulus (f) Root Mean Square deviation.
13. The following table gives weights of one hundred persons. Compute the
coefficient of dispersion by the Method of Limits.
Weight in lbs. of 100 persons
Class-interval   No. of persons
 85- 95           4
 95-105          13
105-115           8
115-125          14
125-135           9
135-145          16
145-155          17
155-165           9
165-175           8
175-185           2
                100
14. What are the different measures of dispersion? The following table gives
the height of one hundred persons. Calculate the dispersion by Range Method.
Height of 100 persons in inches
Height in inches   Frequency
Below 62             2
      63             8
      64            19
      65            32
      66            45
      67            58
      68            85
      69            93
      70           100
15. The following are the marks obtained by a batch of 9 students in a certain
test : 
Serial Number Marks Serial number Marks
(out of 100) (out of 100)
1 68 5 54
2 49 6 38
3 32 7 59
4 21 8 66
9 41
Calculate the mean deviation of the series.
18. Calculate the mean deviation from the following data. What light does it
throw on the social conditions of the community?
Difference in age between husband and wife in a particular community.
Difference in years Frequency Difference in years Frequency
0 5 449 . : 2025 109
510 705 2530 52
1015 507 3035 16
1520 281 3540 .oJ
19. The following table gives the age distributions of students admitted to a
college in the years 1914 and 1938. Find which of the two groups is more variable in age.
Number of students admitted in
I
Age 1914 19.38
15 0 1
16 1 6
17 3 .4
18 8 :2
19 ·12 ·5
20 14 :0
21 14 7
22 5 .9·
23 .2 3
24 3 0
25 1 O.
.Q 1
26
27 1 0
20. Calculate quartile deviation and its coefficient of A's monthly earnings
for a year.
Months   Monthly earnings   Months   Monthly earnings
         Rs.                         Rs.
1        139                7        160
2        150                8        161
3        151                9        162
4        151                10       162
5        157                11       173
6        158                12       175
21. From the following table giving height of students calculate the
Semi-Interquartile Range and the Coefficient of Quartile Deviation.
23. Compute the standard deviation of the rainfall in the various jute-growing
districts of Bengal from the following statement:
24. Calculate the standard deviation of the following two series. Which shows
greater deviation?
25. Find standard deviation of the figures in the following table to show whether
the variation is great in the area or the yield?
26. The index numbers of prices of cotton and coal shares in 1942 were as under:
31. Calculate the standard deviation for the following table giving the age
distribution of 542 members of the House of Commons.
Age     No. of members
20        3
30       61
40      132
50      153
60      140
70       51
80        2
Total   542
32. The following table gives the frequency distribution of expenditure on food
per family per month among working class families in two localities. Find the arith
metic average and the standard deviation of the expenditUle at both places.
Range of expenditure No. of families
in Rs. per month Place A Place B
Rs. 3 6 28 39
6 9 292 284
" 912 389 401
" 1215 212 202
1518 59 48
1821 18 21
~~ ~ 5
(P. C. S., 41).
33. Find the mean yield of paddy and the standard deviation for the distribution
of the results of 3,061 crop-cutting experiments shown in the following table:
Yield of paddy per acre in
Lbs.          No. of experiments
   0- 400      236
 401- 800      481
 801-1200      604
1201-1600      576
1601-2000      419
2001-2400      333
2401-2800      217
2801-3200       87
3201-3600       64
3601-4000       23
4001-4400       14
4401-4800        6
4801-5200        1
              3061
(B. Com., Bombay, 1945).
34. Calculate the mean and standard deviation of the following series:
Marks    Number of students   Marks    Number of students
 1- 5     1                   21-25    7
 6-10    18                   26-30    2
11-15    25                   31-35    1
16-20    26
35. Find out the mean and standard deviation of the following data:
Age under   Number of persons   Age under   Number of persons
            dying                           dying
10          15                  50          100
20          30                  60          110
30          53                  70          115
40          75                  80          125
36. Find out the coefficient of variation of the following series:
                 Number of                      Number of
Income           persons       Income           persons
More than 1000      0          More than 500     600
More than  900     50          More than 400     750
More than  800    110          More than 300     850
More than  700    200          More than 200     900
More than  600    400          More than 100    1000
37. Calculate the standard deviation of the following series:
Marks           Number of students
More than  0    100
More than 10     90
More than 20     75
More than 30     50
More than 40     25
More than 50     15
More than 60      5
More than 70      0
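38. Find out the mean and variance from the following data:
                                               Factory A   Factory B
Wages                                          No. of      No. of
                                               workers     workers
Not exceeding Rs. 40                           30          45
Exceeding Rs. 40 but not exceeding Rs. 80      25          35
Exceeding Rs. 80 but not exceeding Rs. 120     30          25
Exceeding Rs. 120 but not exceeding Rs. 160    45          40
Exceeding Rs. 160 but not exceeding Rs. 200    25          25
Exceeding Rs. 200 but not exceeding Rs. 240    13          20
Exceeding Rs. 240 but not exceeding Rs. 280    24           5
Exceeding Rs. 280 but not exceeding Rs. 320     8           5
Total                                          200         200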
39. A collar manufacturer is considering the production of a new style of collar
to attract young men. The following statistics of neck circumferences are available
based upon measurements of a typical group of college students:
Midvalue No. of students Midvalue No. of students
(inches) (inches)
12.5 4 15.0 29
13.0 19 15.5 18
13.5 30 16.0 1
14.0 63 16.5 1
14.5 66
Compute the Standard Deviation and use the criterion (X ±3 Standard Deviation)
to determine the largest and smallest size of collars he should make in order to meet
the needs of practically all his customers, bearing in mind that collars are wom. on
~:age, ! inch larger than neck si.l.e. (D. Com., RRj., 1949).
40. Calculate the arithmetic average and the standard deviation of the following
figures and state the percentages of cases which lie outside the mean at distance $a \pm \sigma$,
$a \pm 2\sigma$, $a \pm 3\sigma$, where $\sigma$ stands for the standard deviation.
148, f45, 141, 116, 96, $II, 87, 89, 91, 91, 102, 95, 108, 120, 139.
41. Find the S. D. of the following frequency distribution : 
Exceeding But not exceeding Frequency
5.5 6.5 4
6.5 7.5 2
7.5 8.5 5
8.5 9.5 7
9.5 10.5 9
10.5 11.5 4
11.5 12.5 2
(M. A., Agrll, ,1934).
42. The following table relates to the profits and losses of 100 firms. Calculate
the average profits and the standard deviation of profits.
Profits (Rs.)        Number of firms
5000 to 6000               8
4000 to 5000              12
3000 to 4000              30
2000 to 3000              10
1000 to 2000               5
0 to 1000                  5
-1000 to 0                 6
-2000 to -1000             8
-3000 to -2000             9
-4000 to -3000             7
43. In any two series, where /1 and /. represent the deviation from a trial average,
100,
X/l 180 ,E/11=245320
XJ.250 ,EJ,I4385Q
II .... 100
Calculate the coefficient of variation for the two series.
44. In any two samples, where the variates X1 and X2 are measured in the same
units,
"136 (summation) L'Xll=49428
",49 ., 2',..,1..,71258
Compute the values of the Standard Deviations of the two samples. What
additional information is required to calculate the coefficient of variation of the above
two samples ? Indicate the uses of such a coefficient. (B. Com., Lucknow, 43).
45. An analysis of the monthly wages paid to workers in two firms A and B,
belonging to the same industry, gives the following results :
                                         Firm A      Firm B
Number of wage-earners                     586         648
Average monthly wage                     Rs. 52.5    Rs. 47.5
Variance of the distribution of wages      100         121
(a) Which firm, A or B, pays out the larger amount as monthly wages ?
(b) In which firm, A or B, is there greater variability in individual wages ?
(c) What are the measures of (i) average monthly wage, and (ii) the variability
in individual wages, of all the workers in the two firms, A and B, taken together ?
(I. A. S., 1951).
FUNDAMENTALS OF STATISTICS
46. The following table gives the marks obtained by 100 students :
Digits (Division of Class-interval)
Marks 0 1 2 3 4 5 6 7 8 9 Total
0-9 2 4 3 1 1 1 12
10-19 5 3 4 2 1 15
20-29 1 7 8 10 5 4 3 2 40
30-39 3 5 10 2 1 1 22
40-49 4 3 2 2 11
Total 100
By calculating the coefficient of variation in each case, find which team may
be considered more consistent. (I. A. S., 1954).
52. Explain the method of computing the standard deviation of a frequency
distribution from a working origin different from the arithmetical mean.
Calculate the standard deviation for the data given below using the interval
50-59 as working origin :
Class-interval Frequency
0-9 2
10-19 4
20-29 23
30-39 30
40-49 40
50-59 45
60-69 35
70-79 25
80-89 12
90-99 9
100-109 6
110-119 10
120-129 3
130-139 1
140-149 1
150-159 3
Total 249
How would the value obtained above be modified if you have to adjust it for
the reason that the data are grouped in class-intervals ? (I. A. S., 1956).
53. The following is a record of the number of bricks laid each day for 20 days
by two bricklayers A and B :
A 725, 700, 750, 650, 675, 725, 675, 725, 625, 675,
700, 675, 725, 675, 800, 650, 675, 625, 700, 650.
B 575, 625, 600, 575, 675, 625, 575, 550, 650, 625,
550, 700, 625, 600, 625, 650, 575, 675, 625, 600.
Calculate the coefficient of variation in each case, and discuss the relative
consistency of the two bricklayers. If the figures for A were in every case 10 more and
those for B in every case 20 more than the figures given above, how would the
answer be affected ? (M. Com., Banaras, 1950).
54. A distribution consists of three components with frequencies of 200,
250 and 300 having means of 25, 10 and 15 and standard deviations of 3, ",
and 5 respectively. Find the mean and the standard deviation of the combined
distribution. (M. Com., Banaras, 1954).
55. Suppose each measurement in a distribution is multiplied by 2. What
happens to the : 
(a) mean of the distribution
(b) variance of the distribution
(c) standard deviation of the distribution
(d) each of the three if .. is added to each measurement ?
56. Compute the values of arithmetic average, mode, median and standard
deviation for the following observations :
96, 8.... 10.3, 88, 92, 98, 100, 96, 87
92, 94.
57. Suppose a group of children have a distribution of I. Q. Scores with mean
100 and standard deviation 10. If one child with I.Q. 70 is removed, what will be the
effect on the mean and standard deviation ?
58. Three distributions each of 100 members and standard deviation 4.5 units
are located with their arithmetic means at 12.1, 17.1 and 22.1 units respectively. Find
the standard deviation of the distribution obtained by combining the three.
59. The first of the two samples has 100 items with mean and standard
deviation ,: If the whole group has 250 items with mean 15.6 and standard deviation
vIT44 find the standard deviation of the second group.
. , (M. A., Beo., Ik/~/, 1~"91
60. The mean and the standard deviation of a sample of 100 observations were
calculated as 40 and 5.1 respectively by a student who took by mistake 50 instead of
40 for one observation. Calculate the correct mean and standard deviation.
61. Coefficient of variation of two series are 60% and 70%. Their standard
deviatjons are :z [ {lnd z6. What are their arithmetic means?
62. Given: Number Mean Variance
IG~ ~
11 Group 60 5'
1 and II Group combincd 95 u .,
Find the missing items.
63. Indicate the extent of dispersion graphically for the data given in the
following table :
Years
Income (in thousands)
I '.><
AII
B1
6
t6
55
8
ao
,6
JJ
(8
57
9
18
58
8
ZO
59
10
Z2
60 61
12
36
10
18
6z 6_
J4
zz
u.
110
64. The table given below gives the population and weekly earnings of two
localities A and B. Represent the data graphically to bring out the inequalities
of distribution of earnings.
Weekly earning
(in Rs. I
o:to I 2
2040 6 S
4000 8 zo
6080 IS zS
SoIOO 20 4~
65. Find the actual classintervals from the data given below :
dx 3 2 I 0 1 :l S
f 10 15 25 25 10 10 5
,~ n n n
Second Moments about the value x. It is obvious that the second moment about
a value other than the mean would be more than the second moment
about the mean. The first moment about the mean is 0 because the
sum of the deviations from the mean is always 0. The second moment
about the mean is the variance or the square of the standard deviation.
Just as we can calculate the first and second moments either about the
mean or about any other value, similarly 3rd, 4th, 5th and nth moments
can be calculated either about the mean or about any other value of the
variate. Thus the third moment about the mean or
$$\pi_3 = \frac{\sum f d^3}{n}$$
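The definition above can be sketched in a few lines of Python. This is only an illustration: the helper name `moment_about` and the small frequency distribution are hypothetical and not taken from the text. It shows that the first moment about the mean vanishes and that the second moment about any other value exceeds the variance by the square of the distance between that value and the mean.

```python
def moment_about(sizes, freqs, origin, r):
    """r-th moment of a frequency distribution about `origin`:
    sum of f * (x - origin)**r, divided by n (the total frequency)."""
    n = sum(freqs)
    return sum(f * (x - origin) ** r for x, f in zip(sizes, freqs)) / n

# Hypothetical illustrative data (not from the text).
sizes = [1, 2, 3, 4, 5]
freqs = [2, 3, 5, 3, 2]

mean = moment_about(sizes, freqs, 0, 1)            # first moment about zero is the mean
first_about_mean = moment_about(sizes, freqs, mean, 1)
variance = moment_about(sizes, freqs, mean, 2)      # second moment about the mean
second_about_2 = moment_about(sizes, freqs, 2, 2)   # second moment about the value 2
third_about_mean = moment_about(sizes, freqs, mean, 3)

print(mean, first_about_mean, variance, second_about_2, third_about_mean)
```

Because this illustrative distribution is symmetrical, its third moment about the mean is also zero.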
Size of   Frequency   dx     fdx    fdx²    fdx³      d       d²      fd²      d³       fd³
item      (f)
2           10        -2     -20      40     -80    -3.35   11.22   112.2    -37.6    -376.0
4           15         0       0       0       0    -1.35    1.82    27.3     -2.46    -36.9
8            8        +4     +32     128    +512    +2.65    7.02    56.1    +18.6    +148.8
10           7        +6     +42     252   +1512    +4.65   21.62   151.3   +100.7    +704.9
Total       40               +54     420   +1944                    346.9             +440.8
(dx = deviation from the arbitrary origin 4 ; d = deviation from the mean 5.35)
The first moment about the arbitrary origin (4) or
$$\nu_1 = \frac{\sum f d_x}{n} = \frac{54}{40} = 1.35$$
The first moment about the mean or
$$\pi_1 = \nu_1 - \nu_1 = \frac{54}{40} - \frac{54}{40} = 0$$
The second moment about the arbitrary origin (4) or
$$\nu_2 = \frac{\sum f d_x^2}{n} = \frac{420}{40} = 10.5$$
The second moment about the mean or
$$\pi_2 = \nu_2 - \nu_1^2 = \frac{420}{40} - \left(\frac{54}{40}\right)^2 = 8.68$$
The third moment about the arbitrary origin (4) or
$$\nu_3 = \frac{\sum f d_x^3}{n} = \frac{1944}{40} = 48.6$$
The third moment about the mean or
$$\pi_3 = \nu_3 - 3\nu_1\nu_2 + 2\nu_1^3 = \frac{1944}{40} - \left(3 \times \frac{54 \times 420}{40 \times 40}\right) + 2 \times \left(\frac{54}{40}\right)^3$$
$$= 48.6 - 42.5 + 4.92 = 11.02$$
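The conversion from moments about the arbitrary origin to moments about the mean can be checked with a short Python sketch. The sizes (2, 4, 8, 10), frequencies (10, 15, 8, 7) and origin (4) are read from the worked table; the helper name `nu` and the variable names are assumptions made for illustration.

```python
sizes = [2, 4, 8, 10]
freqs = [10, 15, 8, 7]
origin = 4                       # arbitrary origin used in the worked example
n = sum(freqs)                   # 40

def nu(r):
    """r-th moment about the arbitrary origin."""
    return sum(f * (x - origin) ** r for x, f in zip(sizes, freqs)) / n

nu1, nu2, nu3 = nu(1), nu(2), nu(3)

# Moments about the mean obtained from the moments about the origin.
pi1 = nu1 - nu1
pi2 = nu2 - nu1 ** 2
pi3 = nu3 - 3 * nu1 * nu2 + 2 * nu1 ** 3

# Cross-check: the same moments computed directly from deviations about the mean.
mean = origin + nu1
pi2_direct = sum(f * (x - mean) ** 2 for x, f in zip(sizes, freqs)) / n
pi3_direct = sum(f * (x - mean) ** 3 for x, f in zip(sizes, freqs)) / n

print(nu1, nu2, nu3)             # moments about the origin
print(pi1, pi2, pi3)             # moments about the mean
```

Carried at full precision the second moment about the mean is 8.6775 and the third is about 10.996; the figures 8.68 and 11.02 in the hand calculation differ only through rounding of the intermediate terms.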
and further
$$\gamma_1 = +\sqrt{\beta_1}$$
and
$$\gamma_2 = \beta_2 - 3$$
Thus for example No. 1,
$$\beta_1 = \frac{(11.02)^2}{(8.68)^3} \quad \text{and} \quad \gamma_1 = \sqrt{\frac{(11.02)^2}{(8.68)^3}}$$
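The arithmetic for example No. 1 can be carried through as a quick check. The sketch below assumes the reading of the passage in which the first coefficient is the square of the third moment about the mean divided by the cube of the second, and the second coefficient is its positive square root; the variable names are illustrative.

```python
import math

pi2 = 8.68    # second moment about the mean, from example No. 1
pi3 = 11.02   # third moment about the mean, from example No. 1

beta1 = pi3 ** 2 / pi2 ** 3     # (11.02)^2 / (8.68)^3
gamma1 = math.sqrt(beta1)       # positive square root, as in the text

print(round(beta1, 4), round(gamma1, 4))
```

With these inputs the first coefficient comes to roughly 0.186 and its square root to roughly 0.431.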
Need and meaning. In our studies so far, we have discussed the methods
of measuring the central tendency of a frequency distribution and the
methods of studying the concentration of items round the central value.
These measures of central tendency and dispersion do not reveal whether
the dispersal of values on either side of an average is symmetrical or not.
If observations are arranged in a symmetrical order round a measure of
central tendency, we get what is called a "symmetrical distribution."
When plotted on a graph paper such a distribution gives a normal or
ideal curve. A normal curve has many mathematical properties, which
we shall study in a later chapter in which we shall discuss the various types
of theoretical frequency distributions. For the present it would suffice
to say that in a normal distribution the values of the mean, median and
mode coincide and the quartiles are equidistant from the median. It is obvious
that in such cases the sum of the deviations measured from the mean,
median or mode would be o. We have already mentioned in earlier
chapters that the empirical relationships between various averages and
measures of dispersion hold good only in a symmetrical distribution.
MOMENTS, SKEWNESS AND KURTOSIS
Figure 1.
Figure 2.
Figure No. 3 also gives the shape of a moderately skew curve.
This curve is skewed to the left and in it, the value of mode would be
greater than the value of median and the value of median would be greater
than the value of the mean. Such curves are called negatively skew.
Figure 3.
Tests of skewness
In order to find out whether a particular distribution is skew, certain
tests are usually applied. They are as follows :
(a) In a skew distribution values of mean, median and mode
would not coincide. The mean and mode would be pulled wide apart
and median would usually lie between them. We have already seen
that in a moderately asymmetrical distribution :
$$\text{Mean} = \text{Mode} + \tfrac{3}{2}(\text{Median} - \text{Mode})$$
(b) In a skew distribution the two quartiles would not be equidistant
from the median or in other words $(Q_3 - M) - (M - Q_1)$ would
not be 0.
(c) A skew distribution when plotted on a graph paper would not
give a symmetrical bell-shaped curve.
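The first two tests lend themselves to a quick numerical check. The sketch below uses a small hypothetical data set (not from the text) and Python's standard `statistics` module; the quartiles follow that module's default convention, which may differ slightly from the rule used elsewhere in this book.

```python
import statistics

# Hypothetical, positively skew data (not from the text).
data = [2, 3, 3, 3, 4, 5, 6, 8, 9, 11, 12]

mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)

# Quartiles under Python's default ("exclusive") method; other conventions
# can give slightly different values.
q1, q2, q3 = statistics.quantiles(data, n=4)

# Test (a): do the three averages coincide?
averages_coincide = (mean == median == mode)

# Test (b): are the quartiles equidistant from the median?
quartile_gap = (q3 - median) - (median - q1)

# Empirical relation for moderately asymmetrical distributions.
empirical_mean = mode + 1.5 * (median - mode)

print(mean, median, mode, averages_coincide, quartile_gap, empirical_mean)
```

For these figures the averages do not coincide and the quartile gap is not zero, so both tests point to a skew distribution. The data happen to satisfy the empirical relation exactly, which real data will seldom do.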
Measures of skewness
The above mentioned tests would indicate whether a particular
distribution is skew or not. If a particular distribution is found to be
skew the next problem that arises is to measure the extent of skewness.
Some distributions may be slightly different from the symmetrical
distribution while others may be very much different from it. Measures
of skewness are meant to give an idea about the extent of asymmetry
in a series.
First measures of skewness. The first measures of skewness are
based on the assumption that in a skew distribution the values of mean,
median and mode do not coincide. This being so, the difference
between any two of these values indicates the extent of skewness.
Thus first measures of skewness are :
(i) Mean − Mode or (a − Z)
(ii) Mean − Median or (a − M)
(iii) Median − Mode or (M − Z)
The above measures of skewness are absolute measures. For purposes
of comparison it is necessary to have relative measures of skewness.
Relative measures of skewness are obtained by dividing the
absolute measures by any measure of dispersion. The absolute measures
of skewness should not be divided by a measure of central tendency or
average because here the problem is not to study the extent of skewness
in relation to the size of items, but it is to study the asymmetry in relation
to the dispersal of items round a central value. The purpose of studying
skewness is to find out how much more or less do the items on one side
deviate from the items on the other side of a central value. Therefore,
absolute measures of skewness should be divided by a measure of
dispersion rather than a measure of central tendency. Relative measures of
skewness are known as coefficients of skewness.
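Thus
Coefficient of skewness or
$$j = \frac{a - Z}{\delta_Z} \quad \ldots \text{(i)}$$
or
$$j = \frac{a - Z}{\delta_a} \quad \ldots \text{(ii)}$$
If mode is ill-defined median can be used in place of mode and then
$$j = \frac{a - M}{\delta_M} \quad \ldots \text{(iii)}$$
or
$$j = \frac{a - M}{\delta_a} \quad \ldots \text{(iv)}$$
$$j = \frac{M - Z}{\delta_Z} \quad \ldots \text{(v)}$$
or
$$j = \frac{M - Z}{\delta_M} \quad \ldots \text{(vi)}$$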
Karl Pearson has given a formula in which the denominator is not
the mean deviation but standard deviation.
Thus
$$j = \frac{a - Z}{\sigma} \quad \ldots \text{(vii)}$$
If mode is ill-defined, Karl Pearson is of opinion that its value
should be estimated on the basis of the empirical relationship which
exists between the values of mean, median and mode in a moderately
asymmetrical distribution. We have seen that in a moderately
asymmetrical distribution
(Mean − Mode) = 3 (Mean − Median)
Thus
$$j = \frac{3(a - M)}{\sigma} \quad \ldots \text{(viii)}$$
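Karl Pearson's two coefficients, numbers (vii) and (viii), can be sketched as follows. The function name `pearson_skewness` and the data are hypothetical; the standard deviation is taken with divisor n (`statistics.pstdev`), which is assumed here to match the definition used in this book.

```python
import statistics

def pearson_skewness(data, use_median=False):
    """Karl Pearson's coefficient of skewness.

    use_median=False: (mean - mode) / sigma            -- formula (vii)
    use_median=True : 3 * (mean - median) / sigma      -- formula (viii)
    """
    a = statistics.mean(data)
    sigma = statistics.pstdev(data)   # standard deviation with divisor n
    if use_median:
        return 3 * (a - statistics.median(data)) / sigma
    return (a - statistics.mode(data)) / sigma

# Hypothetical, positively skew data (not from the text).
data = [2, 3, 3, 3, 4, 5, 6, 8, 9, 11, 12]

j_vii = pearson_skewness(data)
j_viii = pearson_skewness(data, use_median=True)
j_symmetric = pearson_skewness([1, 2, 2, 3])   # symmetrical data

print(round(j_vii, 4), round(j_viii, 4), j_symmetric)
```

Both coefficients come to about 0.90 here because this data set satisfies the empirical relation exactly; with real data the two will usually differ a little. For the symmetrical data the coefficient is 0.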
The value of the above coefficients of skewness would be 0 for a
symmetrical distribution and for skew distributions it would be a pure
number. These are the two properties of these coefficients and for these
reasons they are regarded as better than other measures. In theory there
are no limits to the values of the coefficient numbers (i), (ii), (iii), (iv),
(v), (vi) and (vii). In actual practice for moderately asymmetrical
distributions all these coefficients (excepting No. viii) vary between ±1. The
theoretical limits of coefficient number (viii) are ±3 (because the
theoretical limits of $\frac{a - M}{\sigma}$ are ±1) but they are never reached in actual
practice.
Second Measure of Skewness. The second measure of skewness is
based on the quartiles. It has been said above that in a skewed
distribution $(M - Q_1)$ and $(Q_3 - M)$ would not be equal. A measure of
skewness is thus derived by finding out the difference between these
two values.
Thus
Second measure of skewness
$$= (Q_3 - M) - (M - Q_1) = Q_3 - 2M + Q_1 = Q_3 + Q_1 - 2M$$
The above is an absolute measure of skewness. The relative
measure can be obtained by dividing this absolute measure by the sum
of $(Q_3 - M)$ and $(M - Q_1)$.
Thus the coefficient of skewness or
$$j = \frac{(Q_3 - M) - (M - Q_1)}{(Q_3 - M) + (M - Q_1)} = \frac{Q_3 + Q_1 - 2M}{Q_3 - Q_1} \quad \ldots \text{(ix)}$$
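Coefficient (ix) needs only the three quartile values. A minimal sketch follows, in which the function name `quartile_skewness` and the data are hypothetical, and the quartiles are taken with the default convention of Python's `statistics.quantiles`, which may not match the rule used elsewhere in this book.

```python
import statistics

def quartile_skewness(q1, median, q3):
    """Quartile coefficient of skewness, formula (ix)."""
    return (q3 + q1 - 2 * median) / (q3 - q1)

# Hypothetical, positively skew data (not from the text).
data = [2, 3, 3, 3, 4, 5, 6, 8, 9, 11, 12]

# The three cut points returned are the lower quartile, median and upper quartile.
q1, median, q3 = statistics.quantiles(data, n=4)
j = quartile_skewness(q1, median, q3)

print(q1, median, q3, j)
```

When the quartiles are equidistant from the median, as in `quartile_skewness(1, 2, 3)`, the coefficient is 0.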
Kurtosis
Figure 4.
Measures of kurtosis
Kurtosis is measured by coefficient $\beta_2$ or its derivation $\gamma_2$. We
have seen in connection with the study of moments that
$$\beta_2 = \frac{\pi_4}{\pi_2^2}$$
In other words $\beta_2$ is equal to the fourth moment about the mean
divided by the square of the second moment about the mean.
$$\gamma_2 = \beta_2 - 3$$
The standard value of $\beta_2$ is taken as 3 and the curves with values
of $\beta_2$ less than 3 are called platykurtic and curves with values of $\beta_2$ more
than 3 are called leptokurtic. In a normal or mesokurtic curve the value
of $\beta_2$ is equal to 3. As such for a normal curve the value of $\gamma_2$ is 0, and
in curves which are more flat-topped or more peaked than the normal
curve the value of $\gamma_2$ would be either a minus or plus figure. The
bigger the value of $\gamma_2$ in a frequency distribution, the greater is its
departure from normality.
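The two kurtosis coefficients follow directly from the second and fourth moments about the mean. The sketch below is illustrative only: the helper names and the frequency distribution are hypothetical and not taken from the text.

```python
def central_moment(sizes, freqs, r):
    """r-th moment about the mean of a frequency distribution."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(sizes, freqs)) / n
    return sum(f * (x - mean) ** r for x, f in zip(sizes, freqs)) / n

def kurtosis(sizes, freqs):
    """Return (beta2, gamma2, shape) for a frequency distribution."""
    pi2 = central_moment(sizes, freqs, 2)
    pi4 = central_moment(sizes, freqs, 4)
    beta2 = pi4 / pi2 ** 2        # fourth moment over square of second moment
    gamma2 = beta2 - 3
    # Exact comparison with 0 is kept for simplicity; a tolerance may be
    # preferable with real data.
    if gamma2 > 0:
        shape = "leptokurtic"
    elif gamma2 < 0:
        shape = "platykurtic"
    else:
        shape = "mesokurtic"
    return beta2, gamma2, shape

# Hypothetical illustrative data (not from the text).
sizes = [1, 2, 3, 4, 5]
freqs = [2, 3, 5, 3, 2]

beta2, gamma2, shape = kurtosis(sizes, freqs)
print(round(beta2, 4), round(gamma2, 4), shape)
```

For this distribution the first coefficient is below 3, so the second is negative and the curve is classed as platykurtic.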
Dispersion, skewness and kurtosis contrasted
Now that we have studied dispersion, skewness and kurtosis, it
will not be out of place to compare and contrast them, as all these measures
are meant to study the formation of a frequency distribution. Dispersion
studies the scatter of items round a central value or among themselves.
It does not show the extent to which deviations cluster below an average
or above it. Measures of skewness study this point. They tell us about
the cluster of deviations above and below a measure of central tendency.
In a normal distribution the deviations below and above an average are
equal while in an asymmetrical distribution they are not equal. Kurtosis
studies the concentration of items at the central part of a series. If the
items concentrate too much in the centre the curve becomes leptokurtic,
and if the concentration in the centre is comparatively little the curve
becomes platykurtic.
Thus we find that measures of dispersion, skewness and kurtosis
study three different aspects of a frequency distribution. Measures of
dispersion throw light on the span within which values of a variable lie.
They study the size of a series. Measures of skewness throw light on
the shape of the series and the size of variation on either side of a central
value. Kurtosis studies the frequencies of a series at the central values.
The theory of skewness and kurtosis has not a very great impor
tance in economic and social studies, as in these cases a normal distri
bution is usually out of question, but the importance of these studies is
very great in biological studies and studies relating to other physical
sciences.
Questions
1. Define moments and discuss the method of calculating moments of
dispersion about the mean.
2. How would you calculate the value of a moment about the mean from the
value of the moment about any arbitrary value ?
3. What is skewness ? How does it differ from dispersion ? What are the
various measures of skewness which you know ?
4. What is kurtosis ? What purpose does it serve ? Is the study of kurtosis
useful in economic and social sciences ? If not, why ?
5. Find the Second Moment of Dispersion and a coefficient of skewness from
the data in the following series :
Size of item Frequl;ncy Size of item Frequency
3 7·5 8S
7 8·S 32
za 9·S 8
60
6. Find out the mean wage and a coefficient of skewness for the following :
3~
40 ..,.
men get at the rate of Rs.
..
•• ,.••
55 0 ....
450