Table of Contents
CHAPTER 1 RESEARCH FUNDAMENTALS AND TERMINOLOGY 14
DEFINITIONS OF BASIC RESEARCH TERMS ......................................... 15
Research ..................................................................................................................... 15
Research Methods ...................................................................................................... 15
Research Methodology ............................................................................................... 17
Scientific Methods ...................................................................................................... 17
Research Process ........................................................................................................ 19
Research Design ......................................................................................................... 19
OBJECTIVES OF RESEARCH ....................................................................... 20
MOTIVATION IN RESEARCH ...................................................................... 21
SIGNIFICANCE OF RESEARCH IN GOVERNMENT, INDUSTRY,
BUSINESS AND TRADE ................................................................................ 23
SCOPE OF RESEARCH INCLUDES THE FOLLOWING AREAS ................. 28
PRINCIPLES OF QUALITY RESEARCH WORK ....................................... 29
PROBLEMS/LIMITATIONS OF RESEARCH ............................................ 31
ISSUES AND TRENDS IN RESEARCH ........................................................ 33
SUMMARY ....................................................................................................... 35
REVIEW EXERCISES ..................................................................................... 35
FURTHER READINGS ................................................................................... 37
CHAPTER 2 IMPORTANCE OF RESEARCH IN MANAGEMENT
DECISIONS 38
FUNDAMENTALS OF MANAGEMENT DECISIONS ............................... 39
Characteristics of Management Decisions .................................................................. 39
Elements of Decision Making ...................................................................................... 40
TYPES OF MANAGEMENT DECISIONS .................................................... 44
Planning Decisions on Time Horizons ......................................................................... 45
Static Planning Decisions ............................................................................................ 47
Dynamic Planning Decisions ....................................................................................... 47
Planning under Dynamic Conditions ........................................................................... 48
Planning Intangible Decisions ..................................................................................... 50
Control Decisions ........................................................................................................ 51
Programmed and Non-programmed Decisions ........................................................... 52
Routine and Strategic Decisions ................................................................................. 52
Policy and Strategic Decisions .................................................................................... 52
Departmental and Non-Economic Decisions ............................................................... 52
Organizational and Personal Decisions ....................................................................... 53
IMPORTANCE OF RESEARCH IN MANAGEMENT DECISIONS .......... 55
Research and Corporate Strategy ............................................................................... 55
Research-Based Management Decisions for Positioning in Industry ........................... 57
Management Areas of Decision Making ..................................................................... 58
SCOPE OF RESEARCH IN MANAGEMENT DECISIONS ........................ 61
Types of Research and Management Decisions .......................................................... 61
LIMITATIONS OF RESEARCH IN MANAGEMENT DECISIONS .......... 64
ISSUES IN DECISION MAKING THROUGH RESEARCH ....................... 67
STEPS IN DECISION MAKING THROUGH RESEARCH ........................ 68
Procedure of Decision Making .................................................................................... 68
OPERATIONS RESEARCH TECHNIQUES AND METHODS APPLIED TO
MANAGEMENT DECISIONS ........................................................................ 73
Operations Research Tool Box ...................................................................................... 75
NEWER TRENDS IN DECISION MAKING THROUGH RESEARCH .... 79
SUMMARY ........................................................................................................ 82
REVIEW QUESTIONS .................................................................................... 82
FURTHER READINGS ................................................................................... 83
CHAPTER 3 DEFINING RESEARCH PROBLEMS AND FORMULATION OF
HYPOTHESIS 84
RESEARCH PROBLEM DEFINED ............................................................... 85
NECESSITY OF DEFINING A RESEARCH PROBLEM ............................ 85
FACTORS TO BE CONSIDERED WHILE DEFINING A RESEARCH
PROBLEM ........................................................................................................ 86
PROCEDURE OF DEFINING A RESEARCH PROBLEM ......................... 88
PREREQUISITES OF DEFINING A RESEARCH PROBLEM ................ 90
HYPOTHESIS AND RELATED TERMS ...................................................... 91
Definition .................................................................................................................... 91
Importance of Hypothesis ........................................................................................... 91
Origin of Hypothesis/Sources of Hypothesis ............................................................... 92
Key of Hypothesis ....................................................................................................... 94
DIFFICULTIES IN FORMULATION OF HYPOTHESIS ........................... 96
STEPS IN TESTING THE HYPOTHESIS .................................................... 97
CHARACTERISTICS OF A GOOD HYPOTHESIS ..................................... 99
CONFIRMATION OF HYPOTHESIS ........................................................ 100
SUMMARY ..................................................................................................... 101
REVIEW QUESTIONS ................................................................................. 102
FURTHER READINGS ................................................................................ 103
CHAPTER 4 RESEARCH DESIGN ........................................................ 104
FUNDAMENTALS OF A RESEARCH DESIGN ....................................... 105
Research Design Defined ........................................................................................... 105
Elements of a Research Design .................................................................................. 105
Dependent and Independent Variable ...................................................................... 106
Extraneous Variable .................................................................................................. 106
Control ...................................................................................................................... 107
Confounded Relationship .......................................................................................... 107
Experimental and Non-experimental Hypothesis-testing Research .......................... 107
Experimental and Control Groups ............................................................................. 108
Treatments ............................................................................................................... 108
Experiments ............................................................................................................. 108
Experimental Units ................................................................................................... 109
SIGNIFICANCE OF A RESEARCH DESIGN ............................................ 110
FEATURES OF A GOOD RESEARCH DESIGN ....................................... 111
STEPS IN A RESEARCH DESIGN ............................................................. 112
OBJECTIVITY IN A RESEARCH DESIGN ............................................... 115
Concepts of Objectivity ............................................................................................. 115
Need of Objectivity ................................................................................................... 115
Means of Objectivity or Methods of Achieving Objectivity ....................................... 119
Sources of Prejudices and Biases .............................................................................. 123
TYPES/FORMS OF A RESEARCH DESIGN ........................................... 126
Exploratory or Formulative Design ........................................................................... 126
Characteristics of Exploratory Design ....................................................................... 127
Role/Significance of Exploratory Design ................................................................... 128
Methods of Exploratory Research Design ................................................................. 130
Factors Affecting the Analysis of Insight-stimulating Cases ........................................ 131
Motivating Events for Investigators .......................................................................... 132
Descriptive and Diagnostic Design ............................................................................ 134
Difference between Descriptive and Diagnostic Designs .......................................... 135
Experimental Designs ............................................................................................... 137
SUMMARY .................................................................................................... 138
REVIEW QUESTIONS ................................................................................. 139
FURTHER READINGS ................................................................................ 140
CHAPTER 5 EXPERIMENTAL DESIGN ............................................. 141
FUNDAMENTALS OF EXPERIMENTAL DESIGN ................................ 142
Experiment ............................................................................................................... 142
Experiment Design ................................................................................................... 142
Factor ....................................................................................................................... 142
Level ......................................................................................................................... 143
Treatment ................................................................................................................ 143
Experimental Unit .................................................................................................... 143
Response .................................................................................................................. 143
Effect ........................................................................................................................ 143
Main Effect ............................................................................................................... 144
Interaction ................................................................................................................ 144
Randomization ......................................................................................................... 144
Replication ............................................................................................................... 144
Experimental Error ................................................................................................... 144
NEED FOR EXPERIMENTAL DESIGN .................................................... 145
BASIC PRINCIPLES OF EXPERIMENTAL DESIGN ............................. 147
Application of Randomization of Treatments ............................................................ 148
Randomization of Treatments ................................................................................... 149
STEPS IN PLANNING AN EXPERIMENTAL DESIGN .......................... 151
IMPORTANT EXPERIMENTAL DESIGNS ............................................. 155
Informal Experimental Design ................................................................................... 156
After-only with Control Design .................................................................................. 157
Before-and-after with Control Design ........................................................................ 158
Formal Experimental Designs .................................................................................... 158
Advantages of completely randomized design .......................................................... 160
Randomized Block Design ......................................................................................... 160
Advantages of completely randomized experimental design .................................... 162
Latin Square Experimental Design ............................................................................. 167
Importance of Latin Square Design ........................................................................... 170
Major Assumptions in Analysis of Latin Square ......................................................... 171
Steps in construction of Latin Square Design ............................................................. 172
Analysis of Variance Table ......................................................................................... 173
Randomized Blocks vis-à-vis Latin Square ................................................................. 174
Extension to Latin Cubes ........................................................................................... 174
Factorial Experiment ................................................................................................. 174
DIFFICULTIES IN EXPERIMENTAL DESIGNS ..................................... 176
SUMMARY ..................................................................................................... 177
REVIEW EXERCISES .................................................................................. 177
FURTHER READINGS ................................................................................ 177
CHAPTER 6 METHODS AND TECHNIQUES OF DATA
COLLECTION 180
DATA DEFINED ........................................................................................... 181
CHARACTERISTICS OF DATA ................................................................. 181
PRIMARY DATA .......................................................................................... 184
Advantage ................................................................................................................. 184
SECONDARY DATA .................................................................................... 185
Advantage ................................................................................................................. 186
Disadvantage ............................................................................................................ 186
DISTINCTION BETWEEN PRIMARY AND SECONDARY DATA ...... 188
VERIFICATION OF SECONDARY DATA ................................................ 189
CHARACTERISTICS OF SECONDARY DATA ....................................... 190
SOURCES OF SECONDARY DATA ........................................................... 191
METHODS OF DATA COLLECTION ........................................................ 192
Observation Method ................................................................................................. 192
Characteristics of Observation Method ..................................................................... 193
Merits of Observation Method .................................................................................. 194
Limitations of Observation Method ......................................................................... 195
Interview Method ..................................................................................................... 197
Characteristics of Interview Method ......................................................................... 197
Merits of Interview Method ...................................................................................... 197
Limitations of Interview Method ............................................................................... 199
Types of Interviews ................................................................................................... 201
Means of Eliciting Correct Responses in an Interview ............................................... 204
Questionnaire Method .............................................................................................. 206
Types of Questionnaire ............................................................................................ 206
Procedure of Organization of Research through Questionnaire ................................ 207
Advantage of Questionnaire Method ....................................................................... 209
Disadvantage of Questionnaire Method ................................................................... 211
Construction of Questionnaire ................................................................................. 213
Steps in Construction of Questionnaire .................................................................... 213
Precautions in the Construction of a Questionnaire ................................................. 215
Pretesting of Response in Questionnaire ................................................................. 216
Problem of Response ................................................................................................ 217
Using Schedule Method for Data Collection ............................................................. 220
Purposes/Objectives of the Schedule ....................................................................... 220
Types of Schedules ................................................................................................... 221
Characteristics of a Good Schedule ........................................................................... 222
Suitability of Schedule Method ................................................................................ 223
Limitations of Schedule Method ............................................................................... 224
Distinction between Schedule and Questionnaire .................................................... 225
Questions to be Included in the Schedule ................................................................. 228
Organization of Schedule .......................................................................................... 229
Documented Sources of Data ................................................................................... 231
Personal Documents ................................................................................................. 232
Public Documents ..................................................................................................... 233
State of Documented Data in India ........................................................................... 234
Case Study Method .................................................................................................. 237
Procedure of Case Study ........................................................................................... 242
SUMMARY .................................................................................................... 243
REVIEW QUESTIONS ................................................................................. 244
FURTHER READINGS ................................................................................ 246
CHAPTER 7 SAMPLING AND SAMPLING DESIGN ........................ 247
BASIC DEFINITIONS .................................................................................. 248
Sampling ................................................................................................................... 248
Universe/Population ................................................................................................ 248
Sample ..................................................................................................................... 249
Complete Enumeration or Census ............................................................................ 249
Sampling Frame ........................................................................................................ 250
Sampling Design ....................................................................................................... 250
Statistics and Parameters ......................................................................................... 250
Sampling Errors ........................................................................................................ 250
Precision ................................................................................................................... 250
Confidence Level and Significance Level ................................................................... 251
Sampling Distribution ............................................................................................... 251
Difference between Population and Census ............................................................. 253
LAWS OF SAMPLING ................................................................................. 253
Law of Statistical Regularity ..................................................................................... 253
Reliability of the Law ................................................................................................ 254
Characteristics of the Law ......................................................................................... 254
Limitations of the Law .............................................................................................. 255
Utility of the Law ....................................................................................................... 255
Law of Inertia of Large Numbers ............................................................................... 256
THEORY OF SAMPLING ............................................................................ 258
SCOPE OF CENSUS METHOD OF DATA COLLECTION ..................... 259
Suitability of Census Method .................................................................................... 259
SCOPE OF SAMPLING METHOD OF DATA COLLECTION ............... 260
Features of Sampling Method ................................................................................... 261
Limitations of Sampling ............................................................................................. 262
Characteristics of Ideal Sample ................................................................................. 263
METHODS OF SAMPLING ......................................................................... 264
Probability Sampling Methods .................................................................................. 264
Simple Random Sampling .......................................................................................... 265
Stratified Sampling Method ...................................................................................... 270
Types of Stratified Sampling ..................................................................................... 272
Systematic Sampling ................................................................................................. 274
Sampling Interval/Ratio .......................................................................................... 275
Advantage of Systematic Sampling ........................................................................... 275
Disadvantage of Systematic Sampling ....................................................................... 276
Multistage Sampling ................................................................................................. 276
Brief Description of Staging Process .......................................................................... 276
Merits of Multistage Sampling ................................................................................. 277
Demerits of Multistage Sampling .............................................................................. 278
Non-probability Sampling Methods .......................................................................... 278
Judgment Sampling ................................................................................................... 279
Convenience Sampling .............................................................................................. 280
Quota Sampling ........................................................................................................ 280
Merits of Quota Sampling ......................................................................................... 281
Demerits of Quota Sampling ..................................................................................... 281
RELIABILITY OF THE SAMPLING .......................................................... 282
SIZE OF SAMPLE ......................................................................................... 283
Factors to be Considered in Sample Size ................................................................... 283
DETERMINATION OF SAMPLE SIZE ..................................................... 285
SAMPLING AND NON-SAMPLING ERRORS ......................................... 288
Sampling Errors ......................................................................................................... 288
Non-Sampling Errors ................................................................................................. 289
SUMMARY ..................................................................................................... 290
REVIEW QUESTIONS ................................................................................. 291
FURTHER READINGS ................................................................................ 292
CHAPTER 8 ATTITUDE MEASUREMENT AND SCALES .............. 293
ATTITUDE DEFINED ................................................................................. 294
Characteristics of Attitude ........................................................................................ 294
IMPORTANCE OF THE STUDY OF ATTITUDE ................................... 295
MEASUREMENT OF ATTITUDES ........................................................... 298
DIFFICULTIES IN MEASUREMENT OF ATTITUDES .......................... 299
CONCEPT OF SCALE ................................................................................... 301
SIGNIFICANCE OF SCALING ..................................................................... 304
Basis for Scale Classification ..................................................................................... 305
Stages of Development of Scales .............................................................................. 309
Scale Construction .................................................................................................... 311
Problems in Construction of Scales ........................................................................... 312
ATTITUDE SCALES .................................................................................... 317
Attitude Measuring Scales ........................................................................................ 317
SOURCES OF ERRORS IN MEASUREMENT OF ATTITUDES USING
SCALING TECHNIQUES ............................................................................. 325
SUMMARY .................................................................................................... 326
REVIEW QUESTIONS ................................................................................. 326
FURTHER READINGS ................................................................................ 327
CHAPTER 9 DATA PROCESSING ........................................................ 329
BASICS OF DATA MANIPULATION ....................................................... 330
Elements of Data Processing ..................................................................................... 330
DATA EDITING ............................................................................................ 331
Stages in Editing ....................................................................................................... 333
Precautions in Editing ............................................................................................... 334
CODIFICATION OF DATA ......................................................................... 335
CLASSIFICATION OF DATA ..................................................................... 336
Objectives of Classification of Data .......................................................................... 336
TABULATION OF DATA ............................................................................ 346
Tabulation is the Final Stage in Data Processing ....................................................... 346
Objectives, Importance and Advantages of Tabulation ............................................ 347
Differences between Tabulation and Classification .................................................. 348
Constituents of a Table ............................................................................................. 348
Types of Statistical Tables ......................................................................................... 350
SOLVED PROBLEMS IN DATA PROCESSING ...................................... 354
SUMMARY .................................................................................................... 366
REVIEW QUESTIONS ................................................................................. 367
FURTHER READINGS ................................................................................ 368
CHAPTER 10 STATISTICAL ANALYSIS AND INTERPRETATION
OF DATA: NON-PARAMETRIC TESTS .................................................. 369
DEFINITION OF NON-PARAMETRIC TESTS ....................................... 370
ADVANTAGES OF NON-PARAMETRIC TESTS .................................... 370
LIMITATION OF NON-PARAMETRIC TESTS ....................................... 371
LISTING OF NON-PARAMETRIC METHODS ....................................... 372
THE SIGN TEST ........................................................................................... 373
Application ............................................................................................................... 374
Advantages of Sign Tests ......................................................................................... 374
Limitations of Sign Tests ........................................................................................... 375
A RANK SUM TEST ..................................................................................... 375
KRUSKAL-WALLIS OR H TEST ............................................................ 377
ONE SAMPLE RUN TEST ........................................................................... 377
KOLMOGOROV-SMIRNOV TEST ......................................................... 379
ADVANTAGE ................................................................................................ 379
SPEARMAN'S RANK CORRELATION ..................................................... 380
Advantages ............................................................................................................... 381
Limitations ................................................................................................................ 381
SOLVED EXAMPLES.................................................................................................... 382
SUMMARY..................................................................................................... 405
REVIEW QUESTIONS ................................................................................. 405
FURTHER READINGS ................................................................................ 406
CHAPTER 11 MULTIVARIATE ANALYSIS OF DATA ..................... 407
DEFINITION OF MULTIVARIATE ANALYSIS ...................................... 408
OBJECTIVE OF MULTIVARIATE ANALYSIS ........................................ 409
ADVANTAGE OF MULTIVARIATE ANALYSIS ..................................... 409
DISADVANTAGES OF MULTIVARIATE ANALYSIS .............................. 409
APPLICATIONS OF MULTIVARIATE ANALYSIS ................................ 410
CLASSIFICATION OF MULTIVARIATE TECHNIQUES ...................... 411
Multiple Regression .................................................................................................. 411
Canonical Correlation Analysis .................................................................................. 423
Factor Analysis .......................................................................................................... 434
Methods of Factor Analysis ..................................................................................... 437
Principal Components Method .................................................................................. 443
Maximum Likelihood Method ................................................................................... 457
Multivariate Analysis of Variance .............................................................................. 458
ILLUSTRATION ON CLUSTER ANALYSIS ............................................ 461
APPLICATION AREA OF LSA ................................................................... 226
SUMMARY..................................................................................................... 231
REVIEW QUESTIONS ................................................................................. 231
CHAPTER 12 MODEL BUILDING AND DECISION MAKING ........... 236
FUNDAMENTALS OF MODEL BUILDING ............................................. 237
Advantages of a Model ............................................................................................. 237
Disadvantages of a Model ......................................................................................... 238
Characteristics of Good Models ................................................................................ 238
TYPES OF MODELS .................................................................................... 239
Physical Models ........................................................................................................ 239
Iconic models ............................................................................................................ 239
Analogue models ...................................................................................................... 239
Symbolic Models ....................................................................................................... 242
METHODS OF BUILDING MODELS ........................................................ 244
Advantages of Monte Carlo methods ........................................................................ 246
Disadvantages of Monte Carlo Methods ................................................................... 246
PHASES OF MODEL CONSTRUCTION ................................................... 250
MODEL BUILDING AND DECISION MAKING ...................................... 252
PHASES AND PROCESSES OF MODELLING ........................................ 254
Various Types of Models .......................................................................................... 256
Some famous models of operations research ........................................................... 257
APPLICATION OF MODEL BUILDING IN DECISION MAKING ....... 261
Applications of Dynamic Programming ..................................................................... 261
Characteristics of Dynamic Programming Problem ................................................... 262
APPLICATION OF QUEUING THEORY .................................................. 263
APPLICATION OF SIMULATION ............................................................ 269
Identification of Proper simultaneous Equation Model ............................................ 281
Concept of Identification .......................................................................................... 282
Rules for Identification ............................................................................................. 283
Applying Order Conditions to the Model .................................................................. 285
The Rank Condition .................................................................................................. 286
SUMMARY .................................................................................................... 287
REVIEW QUESTIONS ................................................................................. 288
FURTHER READING .................................................................................. 288
BLOCK 4 REPORT WRITING AND PRESENTATION ........................ 289
CHAPTER 13 SUBSTANCE OF REPORTS ......................................... 290
DEFINITION OF REPORT ......................................................................... 291
PURPOSE OF THE RESEARCH REPORT............................................... 292
REPORT SYNOPSIS .................................................................................... 294
TYPES OF REPORTS .................................................................................. 296
A. Information Report ...................................................................................... 297
B. Decision Reports .............................................................................................. 298
C. Research Reports ............................................................................................ 301
E. Popular Report ..................................................................................................... 304
CHARACTERISTICS OF A GOOD REPORT ........................................... 305
STRUCTURE OF A GOOD RESEARCH REPORT .................................. 307
SUMMARY .................................................................................................... 311
REVIEW EXERCISES .................................................................................. 311
FURTHER READINGS ................................................................................ 312
CHAPTER 14 WRITING AND FORMATTING OF REPORTS ........ 313
PRINCIPLES OF DRAFTING A RESEARCH REPORT ......................... 314
IN REPORT WRITING ................................................................................ 316
LAYOUT OF RESEARCH REPORTS ........................................................ 319
COVER AND TITLE PAGE FORMATS .................................................... 324
INTRODUCTORY PAGES / SECTION OF THE REPORT ................... 327
MAIN TEXT OF THE REPORT ................................................................. 333
REFERENCE SECTION AND ITS STYLING ............................................ 343
SUMMARY .................................................................................................... 348
FURTHER READINGS ................................................................................ 350
CHAPTER 15 PRESENTATION OF A REPORT ................................ 351
NEED FOR REPORT PRESENTATION ................................................... 352
OBJECTIVES OF A GOOD PRESENTATION .......................................... 352
PRESENTATION OF RESEARCH REPORT TO TECHNICAL PERSONS
......................................................................................................................... 352
PRESENTATION OF RESEARCH REPORT TO A LAYMAN .............. 354
PRESENTATION SKILLS SET ................................................................... 355
SUMMARY..................................................................................................... 359
REVIEW EXERCISES .................................................................................. 360
FURTHER READINGS ................................................................................ 361
Part 1
INTRODUCTION TO RESEARCH METHODOLOGY
CHAPTER 1 RESEARCH FUNDAMENTALS AND
TERMINOLOGY
CHAPTER 2 IMPORTANCE OF RESEARCH IN
MANAGEMENT DECISIONS
CHAPTER 3 DEFINING RESEARCH PROBLEMS AND
FORMULATION OF HYPOTHESIS
CHAPTER 4 RESEARCH DESIGN
CHAPTER 5 EXPERIMENTAL DESIGN
INTRODUCTION TO RESEARCH METHODOLOGY
This Part consists of five Chapters, dealing with the introduction
and basic concepts about research methodology.
The first Chapter gives an overview of research fundamentals and
concepts such as the objectives, scope and trends of research. It also
dwells on the motivation behind and significance of research in
government, industry, business and trade.
The second Chapter highlights the importance of the research in
management decisions including the types of decisions and how
various issues in decision making can be resolved through
research.
Chapter three looks into the issues of defining research
problems and hypothesis formulation, including characteristics
of good hypothesis and its testing.
Chapter four provides insights into research design including the
various steps that are needed for a good research design
together with various types of research design.
The last Chapter of this Part explores the concepts of
experimental design. It highlights various experimental designs
such as randomized block design, Latin square design and
factorial design.
CHAPTER 1 RESEARCH FUNDAMENTALS
AND TERMINOLOGY
Objectives
After reading this Chapter you should be able to:
• Understand basic research terms and the objectives
and motivation behind carrying out research
• Appreciate the significance of research in various
aspects of business, trade and government
• Delineate the scope and quality aspects of research
• Understand the limitations and boundaries of research
along with the latest trends in research
Structure
• Definitions of basic research terms
• Objectives of research
• Motivation in research
• Significance of research in government, industry,
business and trade
• Scope of research
• Principles of quality research work
• Problems/limitations of research
• Issues and trends in research
• Summary
• Review Exercises
• Further Readings
DEFINITIONS OF BASIC RESEARCH TERMS
RESEARCH
Research may be defined as a systematic approach/method
consisting of enunciating the problem, formulating a hypothesis,
collecting the facts or data, analyzing the facts and reaching
certain conclusions, either in the form of a solution to the
problem concerned or in the form of generalizations for some
theoretical formulation.
Research may also be defined as a scientific study which, by
means of logical and systematized techniques, aims to:
a. Discover new facts or verify and test old facts
b. Analyze their sequences, interrelationships and
explanations which are derived within an appropriate
theoretical frame of reference
c. Develop new scientific tools, concepts, and theories which
would facilitate reliable and valid study of human
behavior in decision making.
RESEARCH METHODS
Research methods may be defined as all those
methods/techniques that are used for conducting research.
Research methods can be put into the following three groups.
a. The first group includes those methods which are
concerned with the collection of data. These methods are
used where the data already available are not
sufficient to arrive at the required solution.
b. The second group consists of those statistical techniques
which are used for establishing relationships between
variables.
c. The third group consists of those methods which are used
to evaluate the accuracy of the results obtained.
Research methods refer to the behavior and instruments used in
selecting and constructing research techniques.
Activity 1.1
(i) What is the fine line of difference between discovery,
invention and research?
(ii) What is the difference between (a) Research Methods and
(b) Research Methodology? Briefly differentiate.
RESEARCH METHODOLOGY
Research methodology may be defined as a way to
systematically solve the research problem. Research
methodology comprises the research methods, the criteria for
selecting those methods in the context of the research study, and
an explanation of why a particular method or technique is used
and why other techniques are not, so that research results are
capable of being evaluated either by the researcher or by
others. Why a research study has been undertaken, how the
research problem has been formulated, what data have been
collected, what particular methods have been adopted, why
a particular technique of analyzing data has been used, and a
host of similar questions are usually answered when we
talk of research methodology concerning a research problem or
study.
SCIENTIFIC METHODS
Scientific method is a collective term denoting the various
processes with the aid of which the sciences are built. In a wide
sense, any method of investigation by which scientific or other
impartial and systematic knowledge is acquired is called a
scientific method.
The scientific method consists of systematic observation,
classification and interpretation of data. It is a
universally applicable, systematic method of understanding a
phenomenon and verifying the truth. Scientific methods have
the following characteristics.
a. Every conclusion reached through a scientific method must
be verifiable.
b. Scientific laws are universally applicable and can be tested
wherever one wants to check their veracity.
c. Scientific conclusions are predictable.
d. Scientific methods are based on objectivity.
e. Scientific methods employ a systematic approach.
Activity 1.2
Discuss, in view of scientific methods, how Newton
ascertained the Laws of Gravity by observing an apple falling
from a tree.
RESEARCH PROCESS
Research process consists of a series of actions or steps
necessary to effectively carry out research and the desired
sequencing of these steps. These are as follows.
a. Formulating the research problem
b. Extensive literature survey
c. Developing the hypothesis
d. Preparing the research design
e. Determining sample design
f. Collecting the data
g. Execution of the project
h. Analysis of data
i. Hypothesis testing
j. Generalization and interpretation
k. Preparation of the report
RESEARCH DESIGN
Decisions regarding what, where, when, how much, by what
means, concerning an enquiry or a research study, constitute a
research design. A research design is the arrangement of
conditions for collection and analysis of data in a manner that
aims to combine relevance to the research purpose with
economy in procedure. Research design must contain the
following aspects.
a. A clear statement of the research problem
b. Procedures and techniques to be used for gathering
information from the population to be studied
c. Methods to be used for processing and analyzing data
Activity 1.3
What is the difference between Research Process and Research
Design?
OBJECTIVES OF RESEARCH
The following are the objectives of research:
a. Academic objectives: To gain familiarity with a
phenomenon or to achieve new insights into it. The
academic object of research is the acquisition of
knowledge, and it is the thirst for knowledge coupled with
curiosity that has been the guiding force behind a rich
variety of research work independent of any material
incentive.
b. Utilitarian objectives: The primary goal of research,
immediate or distant, is to understand the organizational
culture, social life, social environment, decision making
processes etc. and thereby gain a greater measure of
control over human behavior in the organizational and
social context.
c. Research helps in portraying accurately the
characteristics of a particular individual, situation or a
group in the organization and leads to organization
redesign, and design of strategies of development.
d. Research may be used to determine the frequency with
which a certain thing occurs or with which it is associated
with something else.
e. It helps in testing a hypothesis of a causal relationship
between variables to determine cause and effect
impacts.
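Objective (d) can be made concrete with a small tabulation. The sketch below is illustrative only: the survey records, the department names and the helper functions `frequency` and `association` are hypothetical and are not taken from this Chapter. It counts how often each department occurs in a set of responses and then the share of satisfied respondents within each department, which is one simple way of looking at how one characteristic is associated with another.

```python
from collections import Counter

# Hypothetical survey records: (department, satisfied?) pairs.
# The data are invented purely for illustration.
responses = [
    ("sales", True), ("sales", False), ("sales", True),
    ("production", False), ("production", False), ("production", True),
    ("finance", True), ("finance", True),
]


def frequency(records):
    """Count how often each department occurs in the records."""
    return Counter(dept for dept, _ in records)


def association(records):
    """Return the share of satisfied respondents within each department."""
    totals = frequency(records)
    satisfied = Counter(dept for dept, ok in records if ok)
    return {dept: satisfied[dept] / totals[dept] for dept in totals}


freq = frequency(responses)
share = association(responses)
for dept in freq:
    print(f"{dept}: n={freq[dept]}, satisfied share={share[dept]:.2f}")
```

A difference in shares across departments only describes an association in this sample. Deciding whether it reflects a causal relationship, as in objective (e), would require a suitable research design and hypothesis test.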
MOTIVATION IN RESEARCH
The following are the possible motives for doing research:
a) Curiosity to know what is unknown: Curiosity is an
intrinsic trait of the human mind and a compelling drive in the
exploration of man's surrounding environment.
Scientists undertake research to unveil and grasp the facts
underlying a phenomenon. Accordingly, curiosity about the
unknown is the basic motivating factor in social research.
b) The search for cause-effect relations: Science rests on
the implicit and unshakable faith of man that all events
have a cause and that nothing can happen uncaused.
Industrial research is primarily concerned with the
determination of cause-effect patterns of industrial
phenomena and not merely with a detailed description of
them. Thus the search for causes is also a basic motivating
factor in research.
c) Interest in the novel and unusual: Many events happen in the
life of man, society and organizations which are sudden
and unexpected and do not fit any known causation
pattern. Such cases stimulate scientists to
discover the real nature of these facts and to see whether
they can be made to fit the accepted laws of causation
or, if not, whether they require modification of our laws.
Accordingly, an unusual happening stimulates research.
d) Refinement of techniques of research: Research employs
a number of techniques to study various problems. The
efficacy of a technique is mainly responsible for quick and
reliable results in any investigation. The more sophisticated
a technique, the better the results obtained by its use.
Moreover, certain problems cannot be studied unless suitable
techniques are developed. Therefore, a scientist is keenly
interested in evaluating the current techniques of
research and making constant improvements in them.
Accordingly, refinement of techniques is also a motivating
factor in research.
Activity 1.4
What is the role of the Planning Commission of India in
guiding government spending in development areas?
SIGNIFICANCE OF RESEARCH IN
GOVERNMENT, INDUSTRY, BUSINESS AND
TRADE
The role of research in several fields of applied economics,
whether related to business, industry, trade, commerce,
services or to the economy as a whole, has greatly increased in
modern times. The increasingly complex nature of business, its
size, fast changes in technology etc. have focused attention on
the use of research in solving operational problems. Research as
an aid to economic policy has gained added importance both for
government and business.
The following aspects emphasize the significance of research.
a) Government Policies: Research provides the basis for
nearly all government policies in our economic system.
For example, the government's budgets rest in part on an
analysis of the needs and desires of the people and on
the availability of revenues to meet those needs. The cost
of needs has to be equated to probable revenues and this
is a field where research is most needed. Through
research we can devise alternative policies and can
examine the consequences of each of these alternatives.
b) Allocation of National Resources: Government has to
chalk out programmes for dealing with all facets of the
country's existence and most of these will be related
directly or indirectly to economic conditions. The plight of
cultivators, the problems of big and small business and
industry, working conditions, trade union activities, the
problem of distribution, even the size and nature of
defense services are matters requiring research. Thus
research is considered necessary with regard to the
allocation of the nation's resources.
c) Investigation of Economic Structure: Research is
necessary for collecting information on the economic
and social structure of the nation. Such information
indicates what is happening in the economy, and what
changes are taking place. Collecting such statistical
information involves a variety of research problems.
d) Social Welfare and Progress: Ignorance and lack of
knowledge are the root cause of various social problems.
Communal troubles, religious riots and the false notion of
social or racial superiority are results of ignorance. Through
research it is possible to do away with all these wrong
notions. Research is helpful in the welfare and progress of
humanity and society.
Activity 1.5
NCAER conducts research into economic issues on behalf of the
Government. Find out what kinds of economics-oriented
research it conducts and list a few of them.
e) Solution of Operational and Planning Problems of
Business and Industry: Operations research, market
research and motivational research are considered crucial
and their results assist, in more than one way, in taking
business decisions. Market research is the investigation of
the structure and development of a market for the
purpose of formulating efficient policies for purchasing,
production, and sales. Operations research refers to the
application of mathematical, logical and analytical
techniques to the solution of business problems of cost
minimization or of profit maximization, or what can be
termed optimization problems. Motivational research is
concerned with market characteristics and with determining
why people behave as they do. In other words, it is
concerned with the determination of the motivations
underlying consumer behavior. Research with regard
to demand and market factors has great utility in
business. Given knowledge of future demand, it is not
difficult for a firm or for an industry to adjust its supply
schedule within the limits of its projected capacity. Market
analysis has become an integral tool of business policy
these days. Business projection, which ultimately results
in a projected profit and loss account, is based mainly on
sales estimates, which in turn depend on business
research. Once sales forecasting is done, efficient
production and investment programmes can be set up,
around which are grouped the purchasing and financial
plans. Research thus replaces intuitive business decisions
with more logical and scientific decisions.
f) New Knowledge: Curiosity and the thirst for new
knowledge and new facts about business cycles, environment
analysis and technological upgradation are the main
levers of research.
g) Organizational and Social Control and Prediction:
Through social research and organizational research we
study the social phenomena, events and the factors that
govern and guide them. This study is helpful in
organizational and social control and prediction of social
values, beliefs, traditions, events etc. It finds out new
facts and verifies the old facts on the basis of the
touchstone or tests applied to old facts. It also studies the
dynamics of social relationships and social phenomena,
which helps in controlling social life and predicting
social behavior.
Thus research is the fountain of knowledge for the sake of
knowledge and is an important source of guidelines for
solving different business problems.
Activity 1.6
There are many financial research companies in India. List 10 of
them.
SCOPE OF RESEARCH INCLUDES THE
FOLLOWING AREAS
a. Production Management: Research plays a
significant role in product development, diversification,
launching a new product, product improvement, process
technologies, selecting a site, new investment etc.
b. Personnel Management: Research helps in job
redesign, organization restructuring, development of
motivational strategies and organizational development.
c. Marketing Management: Research plays a significant
role in choice and size of target market, the consumer
behavior in terms of attitudes, life style, and influences of
the target market. It is the main instrument in deciding
price policy, selection of channel of distribution and
development of sales strategies, product mix, promotional
strategies etc.
d. Finance Management: Research helps in portfolio
management, distribution of dividend, capital raising,
hedging and taking care of fluctuations in foreign
exchange and product cycles.
e. Materials Management: It is used in selecting the
supplier, taking the decisions pertaining to make or buy
as well as in deciding negotiation strategies.
f. General Management: It helps in developing the
standards, objectives, long-term goals, and growth
strategies.
PRINCIPLES OF QUALITY RESEARCH WORK
The following principles must be observed in order to ensure
quality research work:
a. Objectivity: The purpose of research should be clearly
defined and common concepts should be used.
b. Usage of scientific procedure: The research procedure
used should be described in sufficient detail to permit
another researcher to repeat the research in a systematic
manner.
c. Continuity: The research should be carried out in such a
manner that principle of continuity is ensured.
d. Proper Planning: The design should be planned in the
most scientific way and all aspects of resources, time
frame, constraints and procedure should be taken into
consideration.
e. Integrity: The researcher should report with complete
frankness any flaws in procedural design and evaluate their
effect on the findings.
f. Adequacy of Data: The analysis of data should be
sufficiently adequate to reveal its significance and the
methods of analysis used should be appropriate.
g. Reliability: The validity and reliability of data should be
checked carefully.
h. Structure: It means that research is structured with
specified sequence in accordance with the well defined set
of rules. Guessing and intuition in arriving at conclusions
are rejected.
i. Logic: This implies that research is guided by the rules of
logical reasoning and logical process of induction and
deduction are used.
j. Empiricism: This implies that research is related
basically to one or more aspects of a real situation and
deals with concrete data that provides a basis for external
validity of research results.
k. Replicability: This principle allows research results to be
verified by replicating the study, thereby building a
sound basis for decisions.
l. Economics: Research should be completed within the
allocated financial resources.
m. Time frame: Research should be completed within the
stipulated time frame.
Activity 1.7
(i) Out of the 13 principles of quality research work, which 2 do
you think are most important and why?
(ii) What is the concept of Empirical Research and what are
its limitations?
PROBLEMS/LIMITATIONS OF RESEARCH
a. Lack of Training: The lack of scientific training in the
methodology of research is a great handicap for
researchers in our country. There is a paucity of
competent researchers in our country.
b. Lack of confidence: Business houses are often
reluctant to supply the needed information to researchers
because of fear of misuse of the information.
c. Repetition: Research studies overlapping one another
are undertaken quite often for want of adequate
information.
d. Lack of Interaction: There is insufficient interaction
between the university research department, on the one
hand and business establishments, government
departments and research institutions, on the other.
e. Absence of Code of Conduct: There does not exist a
code of conduct for researchers, and inter-university and
inter-departmental rivalries are also quite common.
f. Lack of Resources: Adequate funds are not provided
for conducting quality research.
g. Lack of Coordination: There exists lack of coordination
among various agencies responsible for conducting
research.
h. Problem of Conceptualization: Problems of
conceptualization and problems relating to the process
of data collection and related matters often crop up,
resulting in the frittering away of resources.
Activity 1.8
Strategic Research or competitive intelligence is a new area in
the corporate world. Elaborate on its usage.
ISSUES AND TRENDS IN RESEARCH
Research is a field that is rapidly changing and growing in
importance. Current issues and future trends in research include
the following:
a. Intense Competition: The intensity of worldwide
competition in all areas of operations will increase.
The military warfare of the twentieth century will be
replaced by economic warfare in the twenty-first century.
Research will play a very dominant role in product
development, product design and product distribution
strategies.
b. Global Markets, Global Sourcing and Global
Financing: Few organizations and nations will be able to
survive by competing in the domestic/international market.
Companies need to learn about foreign environments,
understand foreign customers, build networks and forge
partnerships. Research will be the key to success for
production and finance management in the context of
the world becoming a global village.
c. Importance of Strategy: Companies will need
long-term global business strategies to survive in the market
place. Vertically integrated partnerships will be needed to
strengthen competitive position. A new type of alliance
based on cooperative specialization within industries may
become a necessity. Such strategic alliances will require
extensive background research.
d. Product Variety and Customization: An increased
variety of products and services will be offered to the
customer. In many cases, customization for the individual
will be possible. This means that the expected life of
products on the market will continue to decrease.
Therefore, product and service innovations based on
market research will hit the market at an increasing rate.
e. Pervasiveness of Services: Customer service is
destined to become another competitive battle ground
and research will assume a more significant role, in
designing and developing services.
f. Emphasis on Quality: The quality of products and
services will continue to improve as customer
expectations of quality grow. This will lead to more
emphasis on product research, social research, packaging
research, distribution research etc.
g. Advances in Technology: Technology will continue to
advance at a rapid rate, particularly in the areas of
advanced materials, advanced machining, biotechnology,
robotics, supercomputing etc. This has resulted in
interdisciplinary research and greater importance being
given to experimental designs.
h. Worker participation: The empowerment of the work
force has had a significant impact on operations in the
past decade. The success of future companies depends on
the workers' involvement, resulting in more emphasis on
participative and motivation research.
i. Concern regarding Business Environment: The
impetus for environmental responsibility will shift from
government regulations to customer response
requirements, leading to more concentration on business
research.
Activity 1.9
Technological research is the keystone of the economic development
of industry. Please elaborate.
SUMMARY
In this chapter the basic concepts of research, research problem,
research methods, research techniques, research methodology,
scientific methods, research process and research design have
been elaborated upon. The objectives of research and motives in
research have been highlighted. The importance and significance,
as well as the scope, of research have been discussed. The
principles of good research have been listed. The issues and
trends of research in modern times have been described.
REVIEW EXERCISES
1. Define the following terms:
a. Research
b. Research Problems
c. Research Methods
d. Research Techniques
e. Research Methodology
f. Scientific Methods
g. Research Process
h. Research Design
2. Explain the various objectives of research.
3. Distinguish between Research Methods and Research
Methodology.
4. What do you mean by Research? Explain its significance
in modern times.
5. Discuss the factors motivating research.
6. Explain the principles of good research.
7. What is the scope of research in the present context of
the opening of the national economy and the globalization
of markets?
8. Explain the problems and limitations faced in the
conduct of research.
9. Discuss the issues and trends of research in the
industrial context.
FURTHER READINGS
• Booth, W.C., Colomb, G.G. & Williams, J.M. (2008). The
Craft of Research. Chicago: University of Chicago Press.
• Kumar, Ranjit (2005). Research Methodology: A Step-by-Step
Guide for Beginners. Newbury Park, CA: Sage
Publications.
• Creswell, John W. (2008). Research Design:
Qualitative, Quantitative and Mixed Methods Approaches.
Newbury Park, CA: Sage Publications.
• Marczyk, G.R., DeMatteo, D. & Festinger, D. (2005).
Essentials of Research Design and Methodology. New York
City, NY: Wiley.
CHAPTER 2 IMPORTANCE OF RESEARCH IN
MANAGEMENT DECISIONS
Objectives
After reading this Chapter, learners will be able to:
• Understand the concepts, types and importance of
management decisions
• Appreciate scope of research for management decisions
and the limitations as well
• Note steps and issues involved in decision making through
research
• Understand operation research and other new trends as
applied to management decision making through research
Structure
• Fundamentals of Management Decisions
• Types of Management Decisions
• Importance of Research in Management Decisions
• Scope of Research in Management Decisions
• Limitations of Research in Management Decisions
• Issues in Decision Making Through Research
• Steps in Decision Making Through Research
• Operation Research Techniques and Methods Applied to
Management Decisions
• Newer Trends in Decision Making Through Research
• Summary
• Review Questions
• Further Readings
FUNDAMENTALS OF MANAGEMENT
DECISIONS
Decision making is a well-tried process of arriving at the best
possible choice for a solution within a reasonable period of time.
To reach a management decision means to weigh options
and to come to a conclusion. Decision making involves two or
more alternatives, because if there is only one alternative, there
is no decision to make. A decision is the conclusion the
decision maker has reached as to what he, or others whom he
leads, should do at some point in time for the betterment of
the business organizations they serve.
CHARACTERISTICS OF MANAGEMENT DECISIONS
The following are the important characteristics of management
decision making.
a. An intellectual exercise: The process of decision
making is basically a human intellectual activity. It is a
mental exercise which considers and evaluates the
alternatives for realizing certain business objectives.
b. A process of selection: Decision making is basically a
process of selection. It chooses the best alternative
course from various alternative courses of action, to
optimize business performance.
c. Resolution of Commitment: Decision making is a
resolution of commitment of mind to act in a certain
manner in the given circumstances. It may mean to do or
not to do a thing.
d. Evaluation of Alternatives: Before taking any final
decision about anything, the decision maker evaluates
various pros and cons of the different alternatives.
e. Decision is always Result Oriented: Decisions are
means for implementing managerial work load. No
decision can be without a purpose. A managerial decision
is taken to realize certain business objectives.
ELEMENTS OF DECISION MAKING
The following are the important elements of management
decisions.
a. Concept of a Good Decision: The first and most important
element of the process of decision making is the
perception of what makes a good decision. The decision
should be sound and result oriented, and should be based
on a careful analysis of facts and figures.
b. Environment of Decisions: The management should
create a favorable environment in the organization
structure for good decisions. The decision environment
can be divided into two parts: internal and external. In
the internal environment, labour-management relations,
organizational pattern, the delegation of authority,
decentralization policy etc. are some of the important
factors. In the external environment, socio-politico-techno
factors are to be considered.
c. Psychological Elements: Decision making is a
human process, so it is natural that the decision taken
will be affected by the psychology of the decision maker.
Some personal attributes affecting decisions are
intelligence, educational level, temperament, position and
attitude etc.
d. Timing of Decisions: All management decisions should
be taken as promptly as possible, as the circumstances
require.
e. Communication of Decisions: Decisions should be
communicated to the concerned parties as soon as they
are finally taken. Communication must be clear, simple,
easy and comprehensive.
f. Participation of Employees: As far as possible the
employees should be given due participation in the
process of decision making. They should be motivated
and trained for it.
Importance of Decision Making
In the present context of the increasing complexity and size of
business and industry on the one hand, and the fast changing
socio-techno-politico-economic environment on the other,
the importance of decision making cannot be
overemphasized.
The complexity of today's managerial activities, which involve
constant analysis of the existing situation, setting objectives,
seeking alternatives, implementing, coordinating, controlling
and evaluating the decisions made, clearly demonstrates the
significance of decision making. In fact, management and
administration is a decision making process. Whatever a
manager does, he does through decision making. The
manager has to take a decision before acting, whether
consciously or unconsciously. Every manager is engaged in
the decision making process.
Decisions involve what is to be done, how it is to be
done, who is to do it, and when it is to be done. Some
decisions are of a routine nature, and others are of a strategic
nature and may require a lot of systematic and scientific
analysis. Whatever may be the case, it cannot be denied that
management consists largely of the decision making process.
Decision making spreads over all the managerial functions
and covers all the areas of business. The management has to
take a large number of decisions while performing its
functions of planning, organizing, staffing, directing and
controlling. In fact, the very determination of objectives,
policies, programmes, organizational structure, motivational
aspects, personnel functions etc, is a decision making
process.
Activity 2.1
Life often brings you to a crossroads. Describe one such occasion
when you had to make a choice and choose a path, in view of the
characteristics of management decisions enumerated above.
They say time and tide wait for nobody. Comment on why the timing
of a decision is of paramount importance.
Read Gollozils essay from the web source listed below and list how
you could be more decisive in your personal life.
[http://www.personaldevelopment.com]
Review your decision making process for joining the
MBA programme.
List under the following:
Which Program:___________
Which College:_____________
What Mode:_______________
TYPES OF MANAGEMENT DECISIONS
Basically, decisions are classified as planning decisions and
control decisions. The sub-classification of decisions under these
two categories is shown in Figure 2.1.
Figure 2.1: Classification of Decisions
However, some other authors classify decisions under the
following categories.
a. Programmed and Non-programmed Decisions
b. Routine and Strategic Decisions
c. Policy and Operating Decisions
d. Departmental and Non-economic Decisions
e. Organizational and Personal Decisions
f. Major and Minor Decisions
[Figure 2.1 box labels, in reading order: Decisions; Planning
Decisions; Control Decisions; Planning Decisions of Time Horizons;
Planning Decisions of Risk Factors; Static Planning Decisions;
Dynamic Planning Decisions; Short range, Medium range, Long range
(listed twice); Risk Analysis; Decision Trees; Preference Theory]
PLANNING DECISIONS ON TIME HORIZONS
Planning is necessary at all levels of an organization. The
types of decisions that planners make can be characterized by the
freedom of choice among alternatives, as displayed in Figure 2.2.
The highest level has the maximum freedom of choice; however,
the risk involved is also higher.
Figure 2.2: Freedom of Choice
At the top level, the aims and overall objectives of the
organization are determined. The constraints limiting the
number of possible alternatives result from the composition of
the organization and its environment, the state of the economy,
the role of competitors, laws, the resources of the firm etc.
Each level lower in the hierarchical structure is further limited
by the constraints listed above. Thus a goal selected at the
highest level gives direction to policy makers at the next lower
level. These policies are then the guidelines which the level
below uses to develop procedures and directions. At a still lower
level the directions are converted to work orders and rules.
[Figure 2.2 labels: Degrees of freedom; Overall objectives]
The relationship between level of authority, number of
employees and time element is depicted in Figure 2.3. Decisions
that cover a period of more than 5 years and involve large risk
are called long-term decisions, while short-term decisions
are those where the time span covered is a few months or a few
weeks, depending upon the requirements. Medium-range
decisions are those where the time coverage lies between 1 and 2
years or so.
Activity 2.2
(i) Briefly discuss long-term decisions. What is meant by
the time horizon of a decision?
(ii) Watch the YouTube video referenced below and comment on
the 3 questions of lifetime planning [ www.youtube.com
Lifetime Planning: the 3 Questions]
Figure 2.3: Time Horizons
STATIC PLANNING DECISIONS
These are routine decisions related to short-term policy and
operational problems, organizational and personnel difficulties,
and departmental and non-economic matters.
DYNAMIC PLANNING DECISIONS
Dynamic planning decisions take into account the changes
taking place during the course of action when the planning is
carried out. Necessary modifications and the effects of a changed
environment are incorporated even at the final stage of the
decision making process.
a. Planning decisions under risk: These relate to the
decisions which surface at the time of launching a new
enterprise, introduction of a new product, merger,
acquisition, international collaboration, expansion and
modernization etc.
b. Long-term Planning Decisions: These pertain to major
objectives, policies and strategies that will govern the
acquisition and allocation of corporate resources. These
decisions provide the broad guidelines to accomplish the
strategic goals of the organization.
c. Medium-term Planning Decisions: These are more
elaborate and comprehensive in nature. The process of
coordination is emphasized and problems concerning
sales, production, revenue, costs etc. are addressed. While
strategic planning is made at the top level of the company,
medium-range planning is undertaken at the middle level,
i.e., at the division level or at the departmental level.
d. Short-term Planning Decisions: These are concerned
with budgets and functional plans for advertising, sales
promotion, employment, training, inventory control etc.
Short-term plans are generally prepared quarterly,
monthly or even weekly.
PLANNING UNDER DYNAMIC CONDITIONS
Analysis of economic forces under static conditions is useful
only to isolate the effects of uncertainty and thereby
to develop further tools for analysis. Static conditions do
not exist in practice, and planning is, therefore, undertaken
under conditions of change and uncertainty. It is the
dynamic character of the environment that makes
planning difficult.
The central problem under dynamic conditions is the
accuracy of a planner's assessment of the future. The
future is uncertain, although the degree of uncertainty may
vary widely between products, markets, geographical and
political areas, and times. When a manager estimates a future
situation, he therefore necessarily makes certain assumptions
as to what will happen. As he weighs his contingencies in one
way or another, he obtains different results. Suppose, for
example, that a manager was planning to build a new plant
and felt that he needed ten years to recover his costs. He
might assess the future with respect to markets, prices,
labour costs, material costs, utilization of plant, labour
efficiency, taxes and other factors. Suppose further that he
estimated six possible situations as being most likely to
occur, created out of different assumptions as to the future.
These might bring completely different estimates of the net
profits, as shown in Figure 2.4.
Figure 2.4: Dynamic Planning Scenario
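The effect of differing assumptions on the profit estimate can be sketched in a few lines of Python. This is a minimal illustration only: the six scenario names, the revenue and cost figures and the plant cost are hypothetical values chosen for the example, not figures taken from Figure 2.4, and the profit formula is a deliberately simplified assumption (no discounting, taxes or salvage value).

```python
from dataclasses import dataclass

PLANT_COST = 100.0   # hypothetical investment, in arbitrary money units
YEARS = 10           # recovery period mentioned in the text


@dataclass
class Scenario:
    name: str
    annual_revenue: float  # assumed yearly revenue under this scenario
    annual_cost: float     # assumed yearly labour, material and other costs


def net_profit(s: Scenario, plant_cost: float = PLANT_COST, years: int = YEARS) -> float:
    """Simplified net profit: yearly margin times years, minus plant cost."""
    return (s.annual_revenue - s.annual_cost) * years - plant_cost


# Six hypothetical situations built from different assumptions about the future.
SCENARIOS = [
    Scenario("strong market, stable costs", 40.0, 22.0),
    Scenario("strong market, rising labour costs", 40.0, 27.0),
    Scenario("moderate market, stable costs", 32.0, 22.0),
    Scenario("moderate market, rising material costs", 32.0, 25.0),
    Scenario("weak market, stable costs", 25.0, 20.0),
    Scenario("weak market, higher taxes and costs", 25.0, 23.0),
]

if __name__ == "__main__":
    for s in SCENARIOS:
        print(f"{s.name:42s} net profit over {YEARS} years: {net_profit(s):8.1f}")
```

With these assumed figures the same plant ranges from a clear gain to a clear loss over the ten years, which is the spread of outcomes the planner has to weigh.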
PLANNING INTANGIBLE DECISIONS
Intangible considerations are factors affecting a decision
which are particularly difficult to quantify. Such
considerations are typically more prevalent when the scope
of the problem is large and the time span longer. For
instance, site selection in a plant location problem is
influenced by the quality of the labour force, climate,
transportation facilities, the reputation of the local bodies,
banking and other supporting facilities etc. These factors must
be considered in the plant location decision although they are
difficult to measure.
Figure 2.5: Intangible Decision Making
The levels of authority exposed to the most intangibles are
the levels best qualified to evaluate them. At the lower levels
the problems are easier to quantify. As shown in Figure 2.5, the
number of intangible considerations, the number of conflicting
objectives and the amount of planning detail required are
less at lower levels.
CONTROL DECISIONS
Control is an essential part of the decision making process. It
is concerned with the evaluation and measurement of
results. Effective performance of the management function
requires adequate measure of control.
Figure 2.6: Control Decisions
The process of control begins with an objective analysis of the
goals of the decision maker. The determination of goals is
generally treated as part of the process of planning; control
implies that the decision maker wants to devise ways and means of
building a framework for accomplishing the objectives.
Standards are established, deviations are measured
against these standards, and corrective actions are taken
in the light of the intensity of the
deviations.
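The standard, deviation and corrective action cycle described above can be sketched as a small Python function. This is only an illustrative sketch: the function name, the 5% tolerance and the three action labels are assumptions made for the example, not rules given in the text.

```python
def control_action(standard: float, actual: float, tolerance: float = 0.05):
    """Compare actual output with the standard and suggest an action.

    The relative deviation is (actual - standard) / standard, so the
    standard must be non-zero. The thresholds below are hypothetical:
    within the tolerance no action is needed, within twice the tolerance
    a minor correction is suggested, beyond that a major correction.
    """
    deviation = (actual - standard) / standard
    size = abs(deviation)
    if size <= tolerance:
        action = "no action"
    elif size <= 2 * tolerance:
        action = "minor correction"
    else:
        action = "major correction"
    return deviation, action


if __name__ == "__main__":
    # Hypothetical monthly output figures measured against a standard of 100 units.
    for actual in (103, 108, 80):
        deviation, action = control_action(100, actual)
        print(f"actual={actual:4d}  deviation={deviation:+.2f}  ->  {action}")
```

The intensity of the deviation decides how strong the response is, which mirrors the idea that corrective decisions are graded rather than all-or-nothing.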
PROGRAMMED AND NON-PROGRAMMED DECISIONS
Programmed decisions deal with routine and/or
repetitive types of problems, while non-programmed
decisions deal with instantaneous and non-repetitive types of
problems.
ROUTINE AND STRATEGIC DECISIONS
Basic or strategic decisions relate to policy matters and
usually involve large investments or expenditure of funds.
Routine decisions, on the other hand, are those which require
little deliberation or those which are made repetitively. For
example, sending samples of a product to the government
investigation centre is a routine decision, but lowering the
price of a product or installing an automatic numerically
controlled machine centre is a major and strategic decision.
POLICY AND OPERATING DECISIONS
Whether or not to give an ex gratia bonus to employees is a
matter of policy to be decided by the top management, but
calculating the bonus in respect of each employee is an
operating decision which can be taken at a much lower level.
DEPARTMENTAL AND NON-ECONOMIC DECISIONS
Departmental decisions are taken by the departmental heads
and relate to the department only. Decisions relating to
non-economic factors such as work ethos, values, moral
behavior etc. may be termed non-economic decisions.
ORGANIZATIONAL AND PERSONAL DECISIONS
When executives take decisions in their official capacity, they
are said to have taken an organizational decision. On
the other hand, personal decisions relate to the executive as
an individual and not as a member of an organization.
MAJOR AND MINOR DECISIONS
A decision to purchase a big machine worth, say, a
few lakh rupees may be called a major decision. On the other
hand, the purchase of a fountain pen, ink or a few reams of
paper is a minor decision and may be decided by the lower
level of employees.
Activity 2.3
How can one mitigate some of these risks?
Give some examples of planning decisions under risk.
S _ R _ T E _ _ C DECISIONS
Why are intangible factors so difficult to measure and quantify?
What is the concept of programmable decisions?
Why do routine or mundane decisions sometimes take so much
time?
What is the need for non-economic decisions?
IMPORTANCE OF RESEARCH IN
MANAGEMENT DECISIONS
RESEARCH AND CORPORATE STRATEGY
Nowadays great stress is laid on integrating research
with corporate strategy. Integrating research means, first and
foremost, integrating research findings into technology and
business strategy, and then managing the research process,
including its linkages broadly throughout the company, with the
same importance with which other critical corporate issues are
managed. In the proper strategic context, research should
promote the products that marketing offers, the processes that
manufacturing operates, and many of the investment decisions
that management makes. Research has three major strategic
purposes:
a. To expand existing business: Existing business support
includes modifying products to improve customer
acceptance or adapting them to different market
standards or regulations, using different or new raw
materials, improving the manufacturing
processes, and dealing with regulatory matters such as
safety considerations and environmental compliance.
Business support also includes developing new products
and manufacturing processes to improve competitive
position within the existing business structure.
b. Exploring New Business: It involves providing
opportunities for new business using existing or new
technologies. The new business may be new to the
company or new to the world. Similarly the new
technologies may be new to the world or new only to the
company.
c. Broadening and Deepening Technological
Capabilities: It may concern existing or new
technologies, depending on the perceived opportunities
and the company's competitive position.
Figure 2.7: Business Research vis-à-vis Industry Maturity (axis label: The Research Mission)
RESEARCH BASED MANAGEMENT DECISIONS FOR POSITIONING
IN INDUSTRY
The strategic mission of a company typically changes as a
function of the maturity of the industry in which the company
operates; hence industry research helps.
a. Embryonic Stage and Decision Making: The business
mission of research at the embryonic stage of the
industry lifecycle is to help launch the new business and
establish its position by demonstrating the validity of
product concept in one or more applications and by
establishing the viability of the manufacturing process.
The mission may also include doing what is needed to
establish and defend the company's intellectual property.
b. Growth Stage and Decision Making: During the
growth stage the purpose of research is to help grow the
business and improve or sustain its competitive position
through new products and applications or by enlarging the
application potential of existing products through improved
features and reduced costs.
c. Maturity Stage and Decision Making: When the
industry becomes mature, the strategic role of research
usually shifts to one of defending competitive position by
extending the differentiation potential of products or
focusing on cost reduction. Management may decide to
rejuvenate the business, and this may also become the
responsibility of research.
d. Aging Stage and Decision Making: In an aging
industry the classical role of research has been cost
reduction and providing the customer support necessary
to safeguard profitability. Strategically, a possible
research thrust in the aging phase is to renew the
products or the technology of manufacturing and drive
competitors out of business rather than be driven out.
MANAGEMENT AREAS OF DECISION MAKING
a. Marketing
• Analysis of marketing research information
• Statistical records for building and maintaining an
existing market
• Sales forecasting
b. Production
• Production planning, control and analysis
• Evaluation of machine performance
• Quality control requirements
• Inventory control measures
c. Finance, Accounting and Investments
• Financial forecast, budget preparation
• Financial investment decisions
• Selection of securities
• Auditing functions
• Credit policy, credit risk, and delinquent accounts
d. Personnel
• Labour turnover rate
• Employment trends
• Performance appraisal
• Wage rate and incentive plans
e. Economics
• Measurement of gross national product and
input-output analysis
• Determination of business cycle, long-term growth
and seasonal fluctuations
• Comparison of market prices, costs and profits of
individual firms
• Analysis of population, land economics and
economics of geography
• Formulation of appropriate economic policies and
evaluation of their effect
f. Product Development
• Development of new product lines
• Optimal use of resources
• Evaluation of existing products
Activity 2.4
What are the major and minor decision points in inventory
management?
Watch the video on YouTube: [Bob Farris on Growth Stage of
Company] Comment on the decisions made for the growth stage
of a company.
Why do financial investment decisions involve risk?
List 5 steps that TATA Motors would have taken for developing a
Nano Car!
SCOPE OF RESEARCH IN MANAGEMENT
DECISIONS
TYPES OF RESEARCH AND MANAGEMENT DECISIONS
The scope of research in decision making varies considerably
with the type of organization, the field of operation
and the environment of operations. The following are the types
of research through which it contributes to arriving at
decisions:
a. Incremental Research: Incremental research aims at
small advances in technology, typically based
on an established foundation of scientific and engineering
knowledge. A typical example of incremental research is
that of reducing manufacturing costs by a continuing
series of small but important advances: energy
conservation, computer-guided process control, better
metallurgy for lower maintenance cost etc. Although
each incremental improvement is small, in the aggregate
they typically produce meaningful savings. Small
incremental technical steps can thus yield large strategic
results.
b. Radical Research: Radical research is the discovery of
new knowledge with the explicit goal of applying that
knowledge to commercial use. Discovery involves
substantial technical risk, cost and time. In radical
research, exploratory projects or feasibility
studies, intended to test the basic concepts on which the
scientific foundation of the project rests, are usually
undertaken. The decision to enter the development phase
takes place only after successful research has already
reduced uncertainty to levels acceptable to the business.
Consciously managing radical research is a means of
reducing risk.
c. Fundamental Research: Fundamental research is an
area that involves some of the most painful strategic
decisions a company's management must make. It won't pay
off for many years, and there will be a host of uncertainties:
scientific, competitive, social and governmental. However,
market leaders do go for it to retain technology
leadership and market share. Fundamental research is
generally carried out by the government, academic
institutions and large industrial establishments.
d. Targeted Basic Research: Research projects that are
basic in nature but technologically oriented are called
targeted basic research. In projects of this type,
governments and industry collaborate closely with each
other to open up applications of a new
technological area.
e. Applied Research: The results of applied research are
intended primarily to be valid for a single or limited
number of products, operations, methods and systems.
Applied research develops ideas into operational forms,
and lays emphasis on new processes, improving existing
product lines or creating new ones, or some specific
aspects of a new technology relevant to the firm.
f. Innovation: Technological innovation refers to the
process of creation, evolution, and development of
technological artefacts. It refers to the range of activities
from product design to its development to its production
and adoption or use. Technological innovation decisions
include all the decisions pertaining to research, design,
development, market research and testing.
Activity 2.5
Would you classify the discovery of cloning techniques as radical
research?
Why do most organizations conduct applied research?
LIMITATIONS OF RESEARCH IN
MANAGEMENT DECISIONS
The following are the main limitations/barriers for research in
management decisions.
a. External Barriers: These are external to the domain of the
firm's control but are part of research. Conservatism
among customers dampens innovative spirit. Competitors'
actions and reactions, legislation, consumerism and
environmentalism are other relevant barriers.
b. Management Barriers: Such barriers essentially relate
to attitudes, policy and sometimes operational issues or
the communication system. New information may be blocked
due to its potential to disrupt the existing form of
organizational stability. There may be a lack of perceived
need to acquire new information and/or a lack of resource
potential to utilize the new information.
i. Inappropriate Justification: Managers very often
attempt to justify or reject cultivation of scientific
knowledge in terms of profitability. A better way of
looking at it is that it is knowledge that makes the
person who possesses it capable of practical action.
ii. No fear of failure: Creative freedom is a prerequisite
for the accumulation of subsequent innovative
potential. An idea which comes to nothing should not
be equated with a failure.
c. Organizational Barriers: These pertain to both
structure and behavior. Many organizations have negative
potential for distorting or filtering the flow of new
information. Also sharply defined and limited roles may
bar flexibility to process new information. Further new
information may not be utilized due to the selective
perception and related biases that are operative in the
organization.
d. Historical Barriers: These represent the tradition and
conservatism within the organization.
e. Resource Barriers: This is a well-known barrier.
Creation is more expensive than imitation. But to become a
leader, resource barriers have to be overcome.
f. People Barrier: Research/innovations are of immense
commercial value. Therefore it is the people who possess
intellectual capital who matter the most.
g. Miscellaneous factors: The other factors which affect
management decision making through research are:
i. Effect of education and training
ii. Financing of research
iii. Effect of norms and standards upon the new
product development
iv. Policy of employment and wages
Activity 2.6
Would you consider mobile telephony a technological innovation?
Why would you say so?
What are the usual organizational barriers to research?
ISSUES IN DECISION MAKING THROUGH
RESEARCH
a. Profitability: All factors being the same, profit denotes
management's ability to discover profitable opportunities
and pursue them vigorously. Profit is the surplus, and
research can ask for a share of the surplus.
b. Early External Competition: The competitive pressure
of the market can spur the project staff to put forth a
greater effort.
c. Favorable internal competition: Internal competition
for resources can be a healthy motivator especially if the
projects are in similar areas.
d. Top management support: If the project personnel
perceive that top management is very much interested in
the project, they are much more enthusiastic than when
they have the opposite impression.
e. Research project personnel commitment: An
absolutely essential ingredient for a research project's
success is the commitment of personnel. Without such
commitment there will not be a full-scale effort. With
commitment, even a small problem is looked upon as a
challenge; without it, such problems can be devastating.
f. Chance event with positive impact: This gladdens the
heart of all believers. Luck or a chance event does play a
significant role in the successful completion of several high
technology research projects.
g. Probability of commercial success: An increase in the
probability of commercial success will spur the research
team, for the smell of success is round the corner.
h. Presence of a research project champion: The presence
of a research project champion helps make the project a
success.
i. Technological route: The development of a product or feature
goes smoothly if the technology is well understood.
Activity 2.7
Why is top management support considered essential for
decision making?
Do you think the formulation of an anti-HIV drug has a great
probability of commercial success?
STEPS IN DECISION MAKING THROUGH
RESEARCH
PROCEDURE OF DECISION MAKING
a. Define and crystallize problems: Since the purpose of
formulating the problem is to determine the optimum
course of action from among various alternatives,
measures of effectiveness, as well as goals, must be
clearly defined. For example, in a production and
distribution planning problem, the decision maker will
probably wish to minimize operating costs, minimize
investment in inventory, satisfy a level of customer
service and optimize the use of capital investments. To
measure effectiveness in reaching these goals, and to
formulate the problem so that the multiple objectives
are satisfied on a balanced optimum basis, particularly in the
light of a variety of inputs, can become a very complex
conceptual and computational matter. The simplest
approach is to use certain goals as constraints by saying,
for example, that the goal is minimum cost while maintaining
a certain fixed level of inventory or customer service.
b. Assembly and analysis of pertinent facts: After
defining the problem, the manager should collect the
facts and analyze them. Factors such as the length of
time for which the decision commits the organization, its
impact on other areas and functions, and quality
considerations should be taken into account in the
analysis and decision making process.
c. Develop alternative solutions or courses of action:
Assuming known goals and clear planning premises, the
next step of decision making is the development of
alternatives. The purpose of finding alternative
solutions is to make the best decision, after a careful
consideration of the most desirable course of action in the
circumstances of the case.
d. Evaluating alternatives: Once appropriate alternatives
have been isolated, the next step in planning is to
evaluate them and select the one that will best contribute
to the goal. This is the point of ultimate decision making,
although decisions must also be made in other steps of
planning: in selecting goals, in choosing critical premises
and even in selecting alternatives.
There are two approaches available for evaluating an
alternative.
1. Marginal analysis approach: In this approach the
additional revenues arising from additional costs are
compared. Thus where the objective is to maximize profits,
this goal will be reached when the additional revenues and
additional costs are equal. Marginal analysis can also be
used in comparing factors other than costs and revenues.
Perhaps the real usefulness of the marginal technique in
evaluation is that it accentuates the variables in a
situation and de-emphasizes averages and constants.
Whether the objective is optimum profits, stability or
durability, marginal analysis will show the way.
2. Cost Effectiveness Analysis: An improvement on, or variant
of, traditional marginal analysis is cost effectiveness or cost
benefit analysis. It is a technique for weighing alternatives
where the optimum solution cannot be conveniently
reduced to rupees or some other specific measure, as it can
be in the case of marginal analysis, which is in actuality a
traditional form of cost benefit analysis. In its simplest
terms, cost effectiveness is a technique for choosing from
among alternatives to identify a preferred choice when
objectives are far less specific than those expressed by
such clear quantities as sales, costs or profits. The major
features of cost effectiveness are concentration on output
from a programme or system, weighing the contribution
of each alternative against its effectiveness in serving
desired objectives, and comparison of the costs of each in
terms of its effectiveness. As illustrated in Figure 2.8, this
can show how much effectiveness can be bought for a
certain cost for each alternative, and how much
effectiveness can be had from any alternative for any given
cost. Cost effectiveness can be made most systematic
through the use of models and other operations research
techniques. Cost models may be developed to show cost
estimates for each alternative, and effectiveness models
to show the relationship between each alternative and its
effectiveness. Then synthesizing models, combining these
results, may be made to show the relationships of costs
and effectiveness for each alternative.
e. Convert the Decision into Effective Action: The
manager must not only make correct decisions but must
make them as needed and as economically as possible,
and he must do this often. Guidelines as to the relative
importance of decisions are therefore useful. Decisions of
lesser importance need not require thorough analysis and
research, and they may even safely be delegated without
endangering an individual manager's basic responsibility.
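The marginal analysis approach described above can be illustrated with a
short Python sketch. The revenue and cost figures are hypothetical and
chosen only for illustration, and the function names are assumptions of
this sketch rather than part of any standard method. Output is expanded
one unit at a time for as long as the additional revenue of the next unit
covers its additional cost.

```python
# Hypothetical totals (in rupees) at output levels 0, 1, 2, ... units.
# These figures are illustrative assumptions, not data from the text.
TOTAL_REVENUE = [0, 100, 190, 270, 340, 400, 450]
TOTAL_COST = [50, 90, 140, 200, 270, 350, 440]


def best_output_level(total_revenue, total_cost):
    """Return the output level chosen by marginal analysis.

    Output is expanded while the additional (marginal) revenue of the
    next unit is at least its additional (marginal) cost.
    """
    level = 0
    for q in range(1, len(total_revenue)):
        marginal_revenue = total_revenue[q] - total_revenue[q - 1]
        marginal_cost = total_cost[q] - total_cost[q - 1]
        if marginal_revenue < marginal_cost:
            break  # the next unit would reduce profit
        level = q
    return level


def profit(level):
    """Total profit at a given output level."""
    return TOTAL_REVENUE[level] - TOTAL_COST[level]


chosen = best_output_level(TOTAL_REVENUE, TOTAL_COST)
print("Chosen output level:", chosen)
print("Profit at that level:", profit(chosen))
```

With these assumed figures the search stops at the unit whose additional
revenue just equals its additional cost, and no other output level gives
a higher profit.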
Figure 2.8: Evaluation of Cost Effectiveness
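The two questions that cost effectiveness analysis answers, namely how much
effectiveness each alternative buys per rupee and which alternative is most
effective for a given cost, can be sketched as follows. The alternatives,
their costs and their effectiveness scores are hypothetical; the score
stands for any assumed non-monetary measure of output.

```python
# Hypothetical alternatives: (name, cost in rupees, effectiveness score).
# The score is an assumed non-monetary measure, for example customers
# served per day. All values are illustrative only.
ALTERNATIVES = [
    ("A", 40_000, 120),
    ("B", 60_000, 210),
    ("C", 90_000, 270),
]


def effectiveness_per_rupee(alternatives):
    """Effectiveness bought per rupee spent, for each alternative."""
    return {name: score / cost for name, cost, score in alternatives}


def most_effective_within(alternatives, budget):
    """Most effective alternative whose cost fits the budget, or None."""
    affordable = [a for a in alternatives if a[1] <= budget]
    if not affordable:
        return None
    return max(affordable, key=lambda a: a[2])[0]


ratios = effectiveness_per_rupee(ALTERNATIVES)
best_ratio = max(ratios, key=ratios.get)
print("Highest effectiveness per rupee:", best_ratio)
print("Best within a budget of 70,000:",
      most_effective_within(ALTERNATIVES, 70_000))
```

The two views need not agree: a larger budget may justify an alternative
with a lower ratio but a higher total effectiveness.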
Activity 2.8
Why is evaluating alternatives the cornerstone of any decision
making process?
When we travel from one place to another, why do we always take
the most cost-effective decision?
It is said that too much analysis leads to paralysis.
Do you agree?
Are the terms operations research and management science
synonymous? Is there a fine line of difference between the
two?
OPERATIONS RESEARCH TECHNIQUES AND
METHODS APPLIED TO MANAGEMENT
DECISIONS
Operations Research Techniques: One of the most
comprehensive research and analysis approaches to decision
making is operations research or, as it is sometimes called,
operations analysis or management science. Operations research,
like accounting analysis or correlation analysis, does not provide
decisions but develops quantitative data to help the manager
make decisions. In most business situations analyses cannot be
so complete or conclusive that they constitute the decision.
However, in a production planning or transportation problem,
the goals may be so clear, the input data so definite and the
conclusions so workable as to point positively to the optimum
solution.
Applying operations research involves the following six steps.
a. Formulate the problem: The first step in carrying out
operations research is to formulate the problem and
analyze the goals and the system in which the solution must
operate.
b. Construct a Mathematical Model: The next step is to
formulate the problem as a system of relationships in a
mathematical model.
c. Derive a Solution from the model: After developing the
mathematical model, the next step is to obtain an optimum
solution either by an analytical procedure or by a numerical
method.
d. Test the model: Because a model by its very nature is
only a representation of reality and it is seldom possible
to include all the variables, models should usually be
tested. This may be done by using the model to solve a
problem and comparing the results so obtained with what
actually happens. These tests may be carried out by using
past data, or by trying the model out in practice to see
how it measures up with reality.
e. Provide controls for the model and solution: Because
a model once accurate may cease to represent reality or
the variables believed to be beyond control may change
in value or the relationships of variables may change,
provision must be made for control of the model and the
solution.
f. Put the solution into effect: The final step is to make
the model and the inputs operable.
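The first four steps above can be sketched on a very small product mix
problem. Everything in the sketch is an assumption made for illustration:
the two products, their profits, the hour limits, the past result used for
testing and the 5 percent tolerance. The solution is derived by a plain
numerical search over whole units rather than by an analytical procedure.

```python
# (a) Formulate: choose how many units of two products to make so that
#     profit is as large as possible. All figures are hypothetical.
PROFIT = {"x": 30, "y": 20}          # profit per unit
MACHINE_HOURS = {"x": 2, "y": 1}     # machine hours needed per unit
LABOUR_HOURS = {"x": 1, "y": 1}      # labour hours needed per unit
MACHINE_LIMIT, LABOUR_LIMIT = 100, 80


# (b) Model: profit and feasibility as simple linear relationships.
def profit(x, y):
    return PROFIT["x"] * x + PROFIT["y"] * y


def feasible(x, y):
    machine = MACHINE_HOURS["x"] * x + MACHINE_HOURS["y"] * y
    labour = LABOUR_HOURS["x"] * x + LABOUR_HOURS["y"] * y
    return machine <= MACHINE_LIMIT and labour <= LABOUR_LIMIT


# (c) Solve: a numerical method, here a search over whole units.
def solve():
    candidates = [
        (x, y)
        for x in range(MACHINE_LIMIT + 1)
        for y in range(LABOUR_LIMIT + 1)
        if feasible(x, y)
    ]
    return max(candidates, key=lambda plan: profit(*plan))


# (d) Test: compare the model's prediction with a past result.
def within_tolerance(predicted, observed, tolerance=0.05):
    return abs(predicted - observed) <= tolerance * observed


plan = solve()
print("Plan (x, y):", plan, "profit:", profit(*plan))
# The observed figure below is an assumed past result, for illustration.
print("Model close to past data:", within_tolerance(profit(*plan), 1750))
```

Steps (e) and (f) are not coded here. In practice they would amount to
re-running the test whenever the input figures change and to putting the
chosen plan into operation.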
Activity 2.9
Watch the video on YouTube [Importance of Operations Research
by Prof. Leon Lasdon]. List 5 important factors.
OPERATIONS RESEARCH TOOL BOX
The mathematical tools of operations research are gaining
popularity for their obvious advantages in problem solving.
They help a problem to be stated with precision rather than
by beating about the bush, to be tackled even when many
variable factors influence the decision, and to have its goals
formulated clearly rather than vaguely or broadly. In short,
the mathematical approach directs the thinking to strike at the
root of the problem. The important tools in the operations
research tool box are:
a. Probability Theory: This important statistical device is
based upon the inference from experience that certain
things are likely to happen in accordance with a
predictable pattern. Thus, if a coin is tossed a hundred
times, it is probable, although by no means certain, that it
will fall heads about fifty times. However, the deviations
from such a probability are within a fairly predictable margin
and consequently the probability becomes a workable
substitute for data otherwise unknown. In an enterprise
problem where probabilities can be substituted for
unknowns, the margin of error in the solution, although
not removed, is limited.
b. Game Theory: This tool is based upon the premise that a
man seeks to maximize his gain and minimize his loss,
that he acts rationally and that an opponent will be
similarly motivated. Under these circumstances, game
theory attempts to work out an optimum solution in which
an individual in a certain situation can develop a strategy
which, regardless of what his adversary does, will
maximize his gains or minimize his losses. Even though
the mathematical development of game theory has not
proceeded beyond the stage of the simplest competitive
situations and there is little evidence that it has been very
useful in actual planning, future development of this
theory may have a remarkable impact on the scientific
approach to strategic planning in competitive situations.
c. Queuing or waiting line theory: This theory uses
mathematical techniques to balance the cost of waiting
against the cost of providing additional service lines.
d. Linear programming: This technique is based upon the
assumption that a linear or straight-line relationship exists
between variables and that the limits of the variations can
be determined. For example, in a production shop, the
variables may be units of output per machine in a given
time, direct labour costs or material costs per unit of
output, number of operations per unit and so forth. Most
or all of these may have linear relationships within certain
limits, and by solving linear equations the optimum in
terms of cost, time, machine utilization or other
objectives can be established. Thus this technique has
had its most promising use in such problem areas as
production planning, shipping rates and routes, and the
utilization of production and warehouse facilities to
achieve lowest overall costs, including transportation
costs. Because it depends on linear relationships, and many
situations cannot be simulated accurately enough by them,
newer and more complex programming systems have come
into use.
e. Correlation theory: All business decisions are affected
by a number of variables, and businessmen know by
experience which of them are the most important for
their concern. By correlation techniques it is possible to
evaluate their relative importance and to construct a
mathematical expression explaining the movement in a
series. By substituting assumed future values for the
variables, a forecast of future trends can be made.
f. Information Theory: Information theory, closely
related to communication theory and applied largely
to the fields of management and engineering, is based on
the volume of knowledge and the information gap built up.
g. Inventory Theory: Inventory theory is also useful in
the solution of production and purchasing problems. It is
also known as the economic lot size theory. Selective
inventory control involves the classification of repetitive
stores according to their value, nature, rate of
consumption, lead time and ease or difficulty of
availability.
h. Network Methods: Problems such as the movement of
commodities, construction of civil engineering projects
and planning of shutdown maintenance are capable of
diagrammatic representation in the form of a network
which, with the help of relevant data, can be subjected to
mathematical treatment through PERT/CPM. The method is
applicable in almost any situation calling for scheduling or
requiring exact timing and performance to be followed.
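A few of the tools listed above reduce to short calculations, sketched
below in Python. The formulas are standard textbook ones that this sketch
brings in as assumptions, since the text does not give them: the binomial
probability for coin tosses, the single-server (M/M/1) average waiting
time, the classic economic lot size formula and a longest-path calculation
for a small activity network. All input figures are hypothetical.

```python
import math


def prob_heads(n, k):
    """Probability of exactly k heads in n tosses of a fair coin."""
    return math.comb(n, k) / 2 ** n


def mm1_average_wait(arrival_rate, service_rate):
    """Average time spent waiting in a single-server (M/M/1) queue."""
    if arrival_rate >= service_rate:
        raise ValueError("the queue grows without limit")
    return arrival_rate / (service_rate * (service_rate - arrival_rate))


def economic_lot_size(annual_demand, order_cost, holding_cost):
    """Classic economic lot size: sqrt(2 * D * S / H)."""
    return math.sqrt(2 * annual_demand * order_cost / holding_cost)


def critical_path_length(durations, predecessors):
    """Length of the longest path through an activity network (CPM)."""
    finish = {}

    def earliest_finish(activity):
        if activity not in finish:
            start = max(
                (earliest_finish(p) for p in predecessors.get(activity, [])),
                default=0,
            )
            finish[activity] = start + durations[activity]
        return finish[activity]

    return max(earliest_finish(a) for a in durations)


# Hypothetical inputs, for illustration only.
exactly_fifty = prob_heads(100, 50)
near_fifty = sum(prob_heads(100, k) for k in range(45, 56))
wait_hours = mm1_average_wait(arrival_rate=8, service_rate=10)
lot_size = economic_lot_size(annual_demand=1200, order_cost=50, holding_cost=3)
durations = {"A": 3, "B": 2, "C": 4, "D": 1}
predecessors = {"C": ["A"], "D": ["B", "C"]}
project_length = critical_path_length(durations, predecessors)

print("Exactly 50 heads:", round(exactly_fifty, 3))
print("Between 45 and 55 heads:", round(near_fifty, 3))
print("Average wait:", wait_hours, "Lot size:", lot_size)
print("Project length:", project_length)
```

The coin example shows why probability is a workable substitute for unknown
data: exactly fifty heads is fairly unlikely (roughly 8 percent), but a
result within five of fifty is likely (roughly 73 percent).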
Activity 2.10
Look up interactive tutorials on game theory. List 5 key points
regarding this theory. [www.economicsnetwork.ac.uk]
Watch the YouTube video Dan Cray Inventory and list the 12 steps.
[www.youtube.com Dan Cray Inventory]
Read the model building tutorial. List the 4-step model building
process.
[www.cs.cmu.edu/~vmr/tutorials]
NEWER TRENDS IN DECISION MAKING
THROUGH RESEARCH
The modern approaches to decision making emphasize the
process of model building, mathematical analysis and the
use of computers. Management decisions will be influenced
greatly by the use of computers and mathematics.
Decision making will be viewed in a broader context. It will
include the entire process of setting goals, defining the scope
of problems, developing choices and implementing the
options selected.
The effectiveness of a decision depends upon the degree to
which an action leads to the results for which it was planned. For
any objective analysis, effectiveness must be expressed in
measurable and quantifiable terms or units. The application
of science in administration should improve the effectiveness
of decisions. Although experience and intuition are still
important, considerable progress has been made in the use of
quantitative analysis for decision making. In brief, the
current issues and trends in the field of decision making can
be summarized as follows.
a. Global competition is forcing companies to become more
effective in developing new products or processes. One of
industry's responses to this challenge is to tap external
sources of technical expertise more extensively.
b. Research cooperation will be classified by the time
period (long, medium or short) and the target focus of
the relationship.
c. Interfirm research cooperation will be more intensive.
Customer-supplier partnerships to develop new products
and special purpose machines, linkages between a company
and its subcontractors, and linkages between small
innovative firms and their larger competitors for the final
development and marketing of promising ideas will get
more attention.
d. A sharp increase in the mechanisms for the collective
funding of research throughout the industrial world will be
witnessed.
e. Technology development will be the order of the day.
f. Pooling research resources to achieve a critical minimum
mass, and interdisciplinary research, are the latest
trends.
g. The current trends in the field of decision making through
research are as follows:
i. Increasing use of tools and models in decision
making
ii. Increasing application of behavioral science to the
problems of administrative decisions
iii. Greater use of computer applications and
advances in management information systems,
which are the backbone of the decision making
process.
Activity 2.11
Watch the YouTube video [908 Germany Joint Research
Cooperation]. List the ways in which Germany is participating in
cooperative research.
[www.youtube.com 908 Germany Joint Research Cooperation video]
List six areas in management decision making where computer
applications are most useful.
SUMMARY
In this Chapter the definition, characteristics and elements of
management decision making have been explained. The
significance and importance of decision making have been
highlighted. Types of decisions have been described. The
importance of research in decision making has been highlighted.
Research and its importance in industry life cycle decisions
and areas of decision making have been discussed. The scope of
research in decision making has been incorporated. The
limitations of research in decision making have been described.
The procedure or steps of decision making have been explained.
O.R. tools and techniques of decision making have been
included. The newer trends in decision making have also been
highlighted.
REVIEW QUESTIONS
1. What do you understand by decision making? What are
its basic elements?
2. Decision making is the primary task of a manager.
Discuss and explain the scientific process of decision
making.
3. What are the principles of decision making? Discuss the
role of employee in decision making.
4. Explain the types of decisions.
5. Decision making is the essence of management. Discuss.
6. Explain the quantitative techniques of decision making.
7. Discuss the role of research in decision making.
8. Write short notes on:
a. Planning Intangible Decisions
b. Routine and strategic decisions
c. Research and industry life cycle decisions
d. Issues and trends in decision making
e. Principles of decision making
f. Limitations of research in decision making.
FURTHER READINGS
• Zikmund, W.G., Babin, B.J., Carr, J.C. and Griffin, M.
(2009). Business Research Methods. Chula Vista, CA:
South Western College Publication.
• Ethridge, D.E. (2004). Research Methodology in
Applied Economics. Daryaganj, ND: Wiley Blackwell.
• Bergh, D. and Ketchen, D. (2009). Research Methodology in
Strategy and Management. Bingley, UK: Emerald Group
Publishing.
CHAPTER 3
DEFINING RESEARCH PROBLEMS AND
FORMULATION OF HYPOTHESIS
Objective
After reading this Chapter, the learner should be able to:
• Appreciate the necessity of, and the factors underlying, the
definition of a research problem
• Understand both the procedure and the prerequisites
for defining a research problem
• Understand various terms related to hypothesis
and its formulation
• Appreciate the steps in testing a hypothesis and its
confirmation
Structure
• Research Problem Defined
• Necessity of Defining a Research Problem
• Factors to Be Considered While Defining a Research
Problem
• Procedure of Defining a Research Problem
• Prerequisites of Defining a Research Problem
• Hypothesis and Related Terms
• Difficulties in Formulation of Hypothesis
• Steps in Testing the Hypothesis
• Characteristics of a Good Hypothesis
• Confirmation of Hypothesis
• Summary
• Review Questions
• Further Reading
RESEARCH PROBLEM DEFINED
A research problem is some difficulty, either of a theoretical or
practical nature, which an individual or organization faces and
for which a solution is sought. A research problem
must contain the following.
a. An individual or an organization which has the problem
b. Some environment or condition to which
the difficulty pertains
c. Some objective or goal to be attained
d. Some alternative courses of action through which these
objectives can be attained
e. Some doubt in the mind of the researcher regarding the
selection among the possible alternatives
NECESSITY OF DEFINING A RESEARCH
PROBLEM
It is important to formulate a research problem properly. In
fact, problem formulation is even more essential than its
prospective solution. A carefully defined research problem does
not let a researcher stray from the research path that should be
followed. It follows that only after a detailed
definition of the research problem can the researcher progress
with the design of the research methodology. This also leads to
smoother progress on all the subsequent steps that are involved
in completing a research project.
Activity 3.1
Open the PowerPoint presentation "Defining Research
Problem". List the various steps given in its definition.
[Web resource: www.web.squ.edu.om/Prathapar.ppt]
FACTORS TO BE CONSIDERED WHILE
DEFINING A RESEARCH PROBLEM
While selecting the research problem, the following
considerations should be borne in mind.
a. Economic Considerations: Research design efforts cost
money. The value of the anticipated results must be
commensurate with the efforts put in. Short research
problems which can yield appreciable dividends quickly
are to be preferred to long term research problems whose
benefits may be difficult to foresee.
b. Technical Considerations: It should be made sure that
adequate technical knowledge is available with which to
carry out the research. Where a large problem
throws up a number of subjects which are independent of
each other, it is better to have small individual research
problems instituted on each subject.
c. Human Considerations: Where resistance to change or
reaction is likely to be great, people's participation and
involvement must be ensured.
d. Environmental Considerations: Controversial subjects
should not be chosen for research unless clearly
warranted. The selection of a problem must be
preceded by a preliminary study. Problems that are very
narrowly defined or have a vague outcome should not be
undertaken. The researcher should also
be familiar with the domain area in which he/she wants to
conduct the research investigation.
e. Limitations and Constraints of the Research Problem:
These are as follows:
i. Time limit: The research must be completed by a
prescribed date.
ii. Resource Constraints: The research must be well within
the stipulated resources allocated for research study.
iii. Policy Constraints: The research problem must give
considerations to policy constraints.
PROCEDURE OF DEFINING A RESEARCH
PROBLEM
The technique of defining a research problem involves the
following steps.
a. Define the Problem in a General Way: The research
problem must address either a specific practical
operational issue or some scientific discovery. It can also
be pertaining to satisfaction or broadening of a particular
intellectual curiosity. Whatever the area of research, the
problem definition should generally be at a logical level.
b. Understanding the Nature of the Problem: The
researcher must understand the origin and nature of the
problem in clear terms through discussions and study of
the environment within which the problem is to be solved.
c. Literature Survey: It is important to review and survey
all the available literature on the research
area prior to defining the research problem. It helps a
researcher to look into newer dimensions in that
particular area and leads to enhancement of knowledge.
d. Experiential Advice: Persons who have knowledge of or
rich experience in the area of research have proved
to be a good sounding board for a researcher. Their advice
and comments on the research proposal help a researcher to
have better clarity and focus on his research topic.
e. Redefining the Research Problem: Many times, the
problem is redefined once the steps listed in a, b,
c and d above are undertaken. The researcher often redefines
the problem in a manner which is more viable and logical
for the conduct of the research. This effort also helps in
defining the hypothesis more sharply.
Activity 3.2
What would be the economic considerations while formulating
research on Disaster Management?
How has information technology helped researchers to do more
comprehensive yet faster literature reviews? Please enumerate 5
tips.
PREREQUISITES OF DEFINING A RESEARCH
PROBLEM
The following prerequisites are to be met in defining a research
problem.
a. Technical terms and scientific words or phrases used in the
statement of the problem should be defined in simple words.
b. Basic assumptions related to the research problem
should be expressed clearly.
c. A clear criterion for the selection of the problem should be
provided.
d. The suitable source of data as well as the time period must
be considered in defining the problem.
e. The scope and limitations of the research must be expressed
fully and clearly in defining the research problem.
Activity 3.3
What is the concept of Experiential Learning?
Do you agree with the adage "The best learning comes from the
school of hard knocks"?
See the PowerPoint presentation "Hypothesis in Research".
Comment on "Building blocks of hypothesis: variables".
HYPOTHESIS AND RELATED TERMS
DEFINITION
A hypothesis often germinates as a probable generalization
whose validity has not yet been tested. Even a hunch or a
novel idea may transform into a hypothesis which then becomes
the actionable agenda of a research project.
A hypothesis may be defined as a proposition, condition or
principle which is assumed, perhaps without belief, in order to
draw out its logical consequences and by this method to test its
accord with facts which are known or may be determined.
IMPORTANCE OF HYPOTHESIS
The success and effectiveness of induction depend crucially
upon the elimination of unnecessary and irrelevant facts and
the picking out of relevant facts. This elimination and selection
of facts hinges upon our hypothesis. Almost every great step in
the history of science has been made possible by the
anticipation of Nature, i.e., by the invention of hypotheses
which, though verifiable, often had very little foundation to
start with. No observation is possible if we do not have some
hypothesis in mind. The function of a hypothesis is to direct our
search for order among facts. The suggestion formulated in the
hypothesis may be the solution to the problem.
ORIGIN OF HYPOTHESIS/SOURCES OF HYPOTHESIS
There is no one unique way of forming a hypothesis. As discussed
earlier, it is the researcher's knowledge base and
research abilities that lead to the definition of a hypothesis.
There are, however, certain suggestions enumerated below which
could assist in developing a hypothesis.
a. Induction by Simple Enumeration: We observe that all
birds, irrespective of size, have beaks and wings. From
this observation, a researcher can form the hypothesis
that birds are creatures with wings and beaks.
b. Method of Agreement: In this approach, the fact that various
objects forming a group have a common circumstantial
background can lead to the development of a hypothesis. For
example, if a doctor discovers that all his lung cancer patients
are chain smokers, he can formulate the
hypothesis that chain smoking leads to lung cancer.
Another example could be that, if all the passengers in a bus felt
giddy, it could be hypothesized that toxic fumes had entered
the passenger compartment. Therefore, a hypothesis can
be relevant to a problem if it expresses a
connection between a set of facts.
c. Analogy: Sometimes certain commonalities between
various things can lead to the formulation of a hypothesis. For
example, if we find that many people from the North Eastern
States share a particular physical feature, we can hypothesize
that other people from the North East also share it
because they belong to the North East. Such a hypothesis
is based on analogical reasoning.
d. Concomitant Variation: At times, a hypothesis can be
formulated about two phenomena by observing how they
vary together. For example, if we find that
chocolates in attractive wrappers sell more than those in
simple wrappers, a hypothesis can be formulated that
chocolates, if wrapped attractively, will sell more.
e. General Culture: The general culture pattern helps
in formulating a hypothesis and also guides its direction.
The mythological basis in Indian culture may form a
suitable basis for the hypothesis of a religious research.
f. Scientific Theory: Many times, it is scientific theory
that provides the underpinning of a hypothesis. Such
scientific theories enable researchers to formulate
hypotheses which could be extensions or corollaries of such
scientific theories.
g. Personal Experience: Sometimes the facts are there,
but only the right individual sees them in the right
perspective and formulates a hypothesis.
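The idea of concomitant variation can be made concrete with a correlation
coefficient. The sketch below is a minimal illustration, assuming made-up
figures for six chocolate brands; the attractiveness ratings, the sales
numbers and the function name are all hypothetical. It uses the standard
Pearson formula, which the text itself does not prescribe.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: wrapper attractiveness rating (1 to 5) from a panel
# and weekly units sold for six chocolate brands. Illustrative only.
attractiveness = [1, 2, 2, 3, 4, 5]
units_sold = [110, 150, 140, 200, 230, 260]

r = pearson_r(attractiveness, units_sold)
print("Correlation:", round(r, 2))
```

A coefficient close to +1 suggests that the two quantities move together
and so supports forming the hypothesis. It does not by itself show that
attractive wrappers cause higher sales; that still has to be tested.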
KINDS OF HYPOTHESIS
These are as follows:
a. Explanatory or Descriptive Hypothesis: A hypothesis
may be about the cause of a phenomenon or about the
law of which it is an instance. A hypothesis about cause is
explanatory whereas a hypothesis about law is
descriptive.
b. Tentative Hypothesis: When a phenomenon cannot be
fully understood because of technical difficulties, we make
a tentative hypothesis about it and see how far it is
successful in explaining the phenomenon. Sometimes we
simultaneously test two or more hypotheses. The famous
hypotheses about the propagation of light, namely the wave
theory and the corpuscular theory, both explain the
phenomenon of light, but neither of them is final. They are
tentative.
c. Representative Fictitious Hypothesis: Some
hypotheses consist of assumptions about a certain
phenomenon that can never be proved by
direct means. Their only merit is their suitability to
express the phenomenon. They are Representative
Fictions. Einstein's formula, $E = mc^2$, is an instance of
representative fiction.
A representative fictitious hypothesis which proves to be
correct becomes a theory or law. The law of gravitation was
a hypothesis in Newton's mind, but when it proved to be true
it became a law.
Activity 3.4
(i) Analogy plays a great part in understanding many issues.
Give an example from everyday life which can lead to the
creation of an analogical hypothesis.
(ii) Read about Explanatory Hypothesis as presented in the
web resource. List 4 things you learned from it.
[www.scribd.com >Presentation>Research]
DIFFICULTIES IN FORMULATION OF
HYPOTHESIS
The following are the difficulties encountered in the formulation of
a hypothesis.
a. Lack of Clear Theoretical Background: If the researcher
does not have a clear-cut theoretical background, it is
not easy to formulate a hypothesis.
b. Lack of Logical Background: If the researcher is lacking in
logical use of the theoretical background, the
formulation of a hypothesis will also be very difficult.
c. Lack of Knowledge of Scientific Methods: It is not
always possible to have complete information about, and
acquaintance with, the scientific methods for the formulation
of a hypothesis. This lack of scientific knowledge presents
difficulty in the formulation of a hypothesis.
Ways and means of removing difficulties in the formulation of
a hypothesis: To overcome the difficulties in the formulation of
a hypothesis, the following steps should be taken:
1. Thorough knowledge of the principles and
practices of the discipline in which the hypothesis is to be
formulated has to be acquired through training, special
programmes, conferences, seminars, etc.
2. From the very beginning the hypothesis should be brief
and timely.
3. The hypothesis should become more elaborate as the
research proceeds.
Activity 3.5
If a researcher is lacking in logical background, can he be
successful? If not, what are the reasons?
STEPS IN TESTING THE HYPOTHESIS
a. Observation: Observation is a precondition of the formulation
of a hypothesis. Unless we perceive a difficulty or problem and
feel an inner urge to solve it, we do not reflect.
Therefore, observation is the first stage of hypothesis making.
b. Reflection: Having felt a difficulty and the need for a solution,
we consider the problem by perceiving the relevant facts. For
example, we see the sea at high tide and also find a clear full
moon above. Now we anticipate a relation which is based upon
experience, namely, whenever there is a high tide there is a full
moon, and never otherwise as far as our experience goes.
Having established a relation between two facts, we now
formulate an answer for the why of this relation. This answer is
the hypothesis.
c. Deduction: The third step in this process is the testing of
the hypothesis: the various deductions possible from it are
examined for their mutual compatibility and their correspondence
with already known facts. For example, if we have a hypothesis
that madness increases with the increasing complexity of
civilization, it will follow from this that there are more mad
persons in New York today than in Delhi today. If this is found
not to be true, our hypothesis is defective, because certain facts
which follow from it are false. Thus deduction is extremely useful
in rejecting ill-formed hypotheses.
d. Verification: Strictly speaking, verification comes after the
formulation of a hypothesis and therefore is not a step in its
formulation. But inasmuch as our interest in making a hypothesis
is not purely academic or theoretical, we wish to solve our
difficulty, and this difficulty can be solved only if we actually
test our hypothesis.
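The reasoning used in the deduction step can be written compactly. Writing
H for the hypothesis and C for a consequence deduced from it (both symbols
are introduced here only for illustration), the rejection of a hypothesis
follows the pattern of inference known as modus tollens:

```latex
\[
  \frac{H \rightarrow C \qquad \neg C}{\neg H}
\]
```

The reverse does not hold. Finding that C is true does not prove H, since
other hypotheses may lead to the same consequence; it only leaves H
unrefuted for the time being.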
Activity 3.6
(i) Watch the video "Hypothesis Testing" on YouTube.
List 5 major tools discussed in the video.
(ii) State whether each of the following hypotheses is good or not.
All tall boys are good sportsmen
Or
Water boils at 100 °C
[www.youtube.com Hypothesis Testing]
CHARACTERISTICS OF A GOOD HYPOTHESIS
A good, workable hypothesis must have the following
characteristics:
a. Specific in Nature: A hypothesis must not be vague or
too general. It should be very specific. It means that the
hypothesis should be quite narrow and to the point.
b. Simplicity: A hypothesis must be simple, clear and
understandable.
c. Conceptual Clarity: The hypothesis must lay down the
concept quite clearly, and the terms and definitions
that are used in the hypothesis should be properly defined
and universally accepted.
d. Brevity: The hypothesis should be brief and be stated in
scientific terms.
e. Related to Theory: A hypothesis must be a corollary of, or
in continuation with, theory already verified. If it is, it will
help the development and growth of science.
f. Related to Technique or Method: A hypothesis must be
related to the available techniques or scientific methods.
Only then will it be capable of being tested or verified.
g. Capable of Empirical Test or Experiment: The
workable hypothesis should be based on existing
experience and be capable of empirical test. It should
not be a mere moral judgment.
CONFIRMATION OF HYPOTHESIS
A hypothesis may be directly or indirectly confirmable. It is
confirmed directly if some observation or experiment can test it.
The hypothesis that coffee taken at night makes a man
sleepless can be tested by giving coffee at night to a number of
people a number of times and observing its effect on them.
Where we cannot confirm a hypothesis directly, we consider the
consequences derivable from it, or we may examine the validity
of the consequences of its opposite.
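The coffee example of direct confirmation can be sketched as a small
calculation. The sketch assumes a design that the text does not specify:
each person is observed on one night with coffee and one night without,
and we simply count how many slept worse after coffee. The count used
below is hypothetical. The calculation is an exact one-sided sign test,
one common choice among several possible tests.

```python
import math


def sign_test_p_value(worse_with_coffee, total):
    """One-sided exact sign test.

    Probability of seeing at least `worse_with_coffee` people sleep
    worse after coffee, out of `total`, if coffee made no difference
    (each person equally likely to sleep better or worse).
    """
    tail = sum(math.comb(total, k)
               for k in range(worse_with_coffee, total + 1))
    return tail / 2 ** total


# Hypothetical observation: 16 of 20 people slept worse after coffee.
p_value = sign_test_p_value(16, 20)
print("p-value:", round(p_value, 4))
```

A small value means that such a lopsided result would be rare if coffee
had no effect, which counts as support for the hypothesis. How small is
small enough is a convention the researcher must fix before the
experiment.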
Activity 3.7
"The world is flat." Is it a good example of brevity in a hypothesis?
"Population grows in geometric progression while food
production grows in arithmetic progression." Is it a good
example of a theory based hypothesis?
What is Hempel's Raven Paradox in the confirmation of
hypotheses?
[Read web resource and answer www.mathacademy.com/pr/prime/articles]
SUMMARY
In this Chapter the definition and components of a research
problem have been highlighted. The necessity of defining a
research problem and the factors to be considered in defining a
research problem have been explained. The procedure and
prerequisites of defining a research problem have been discussed.
The definition of hypothesis, its importance and its kinds
have been incorporated. Difficulties in the formulation of a
hypothesis and ways and means of removing them
have been discussed. Steps in testing a hypothesis, the
characteristics of a good hypothesis and the confirmation of a
hypothesis have been explained.
REVIEW QUESTIONS
1. How does one define a research problem? What are the
main attributes that a researcher should keep in mind while
formulating a research problem?
2. What is the necessity of defining a Research Problem?
Explain the Components of a Research Problem.
3. When a research problem is defined, a step by step
procedure is followed. Explain why this is so.
4. What is the procedure of Defining a Problem? Explain the
factors to be considered in designing a Research Problem.
5. Explain the prerequisites of designing a Research
Problem.
6. Define the term Hypothesis and explain its importance.
7. Explain the Sources of Hypothesis.
8. Discuss the Types of Hypothesis.
9. Explain the difficulties encountered in Formulation of a
Hypothesis. How can these difficulties be removed?
10. Explain the steps in Testing a Hypothesis.
11. What are the Conditions of a Good Hypothesis?
12. Explain the Confirmation of Hypothesis.
FURTHER READINGS
• Creswell, John W. (2008). Research Design: Qualitative,
Quantitative and Mixed Methods Approaches. Newbury
Park, CA: Sage Publications.
• Lehmann, E.L. & Romano, J.P. (2008). Testing Statistical
Hypotheses. New York, NY: Springer Publishing Company.
• Kumar, Ranjit (2005). Research Methodology: A Step by
Step Guide for Beginners. Newbury Park, CA: Sage
Publications.
CHAPTER 4 RESEARCH DESIGN
Objectives
After reading this Chapter, the learners should be able to:
• Understand the fundamentals and significance of good
research design
• Appreciate features of good research design leading to
objectivity of research
• Comprehend the various types of research design and
steps needed to formulate a good research design.
Structure
• Fundamentals of a Research Design
• Significance of a Research Design
• Features of a Good Research Design
• Steps in a Research Design
• Objectivity in a Research Design
• Types/Forms of a Research Design
• Summary
• Review Questions
• Further Reading
FUNDAMENTALS OF A RESEARCH DESIGN
RESEARCH DESIGN DEFINED
Research design can be defined in many ways. Some of the well
known definitions are summarized as follows:
a) It is a conceptual framework under whose umbrella
research will be conducted.
b) It is a design for the collection, measurement and
analysis of data.
c) It is a decision matrix which looks into the aspects of
5WH [What, Where, Which, When & How] as they
pertain to a research enquiry.
d) It can also be defined as a mesh of boundary conditions
for the collection and interpretation of data, which in turn
leads to efficient conduct of the research procedure.
ELEMENTS OF A RESEARCH DESIGN
The important elements of a research design are as under:
a) A specification of the sources and kind of information
needed for conducting research
b) A strategic roadmap that will be deployed for collecting
and analyzing data
c) A definition of both time lines and cost estimates since
most research studies operate under these constraints.
In brief, a research design must contain:
i) A clear statement of the research problem
ii) Procedures and techniques for information gathering from
the sample of research population.
iii) Mathematical algorithms to process and analyze the data.
DEPENDENT AND INDEPENDENT VARIABLES
A concept which can take different quantitative values, such
as weight, height or income, is called a variable. Qualitative
phenomena (attributes) can also be treated as variables,
quantified on the basis of the presence or absence of the
attribute concerned.
Quantitative phenomena that can take different values, even
in decimal points, are called continuous variables. Phenomena
that can only be expressed in integer values are called
discrete variables. For example, the age of children is a
continuous variable whereas the number of children is a
discrete variable.
If one variable depends upon another variable, it is called a
dependent variable, and the variable that is antecedent to the
dependent variable is called an independent variable. For
example, if we say that height depends upon age and sex, then
height is a dependent variable and age and sex are independent
variables.
EXTRANEOUS VARIABLE
Independent variables that are not related to the purpose of
the study but may affect the dependent variable are called
extraneous variables. Whatever effect is noticed on the
dependent variable as a result of extraneous variables is
technically termed experimental error. A study must be
designed in such a manner that the effect upon the dependent
variable is attributed entirely to the independent variables
and not to extraneous variables.
CONTROL
In experimental research, a study designed so as to minimize
the influence of extraneous independent variables is said to
exercise control. Control is an important characteristic of
good research.
CONFOUNDED RELATIONSHIP
When the dependent variable is not free from the effect of
extraneous variables, the relationship between the dependent
and independent variables is said to be confounded by the
extraneous variables.
EXPERIMENTAL AND NON-EXPERIMENTAL HYPOTHESIS-TESTING
RESEARCH
If the purpose of research is to test a research hypothesis, it is
called hypothesis-testing research. It can be experimental or
non-experimental in nature. Experimental hypothesis-testing
research is research in which the independent variable is
manipulated. Where such manipulation is not possible, the
researcher conducts what is known as non-experimental
hypothesis-testing research. For example, suppose a researcher
wants to study whether the height of an athlete affects
performance in a pole-vault competition. For this purpose, he
can randomly select, say, 30 athletes and measure their height
and their performance in the competition. He can then draw a
conclusion about the hypothesis by calculating the coefficient
of correlation between the height of the athlete and the height
of the vault. This is an example of non-experimental
hypothesis-testing research because the researcher has not
manipulated the independent variable (in this case, height).
But suppose the researcher randomly selects 60 students and
divides them into two groups (A and B). Group A is given an
orientation programme prior to an examination, and the
performance of both groups in the examination is then
compared. This is an example of experimental hypothesis-testing
research because the researcher has manipulated the
independent variable (i.e., exposure to the orientation
programme).
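The two illustrations above can be sketched in a few lines of Python. This is only a minimal sketch: the five pairs of heights and vault heights are made-up values used for illustration (not data from any study), and the function names pearson_r and assign_groups are hypothetical helpers, not part of any standard package. The first function computes the coefficient of correlation for observed data (the non-experimental case); the second randomly divides subjects into two equal groups, which is the starting point of the experimental case.

```python
import math
import random
from statistics import mean


def pearson_r(x, y):
    """Coefficient of correlation between two equally long lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def assign_groups(subjects, seed=0):
    """Randomly split subjects into two equal groups.

    The first group is treated as experimental (A), the second as control (B).
    """
    rng = random.Random(seed)  # fixed seed so the split can be reproduced
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Non-experimental case: height is only observed, never manipulated.
# Hypothetical values in metres, for illustration only.
athlete_height = [1.70, 1.75, 1.80, 1.85, 1.90]
vault_height = [4.8, 5.0, 5.1, 5.3, 5.4]
r = pearson_r(athlete_height, vault_height)
print("coefficient of correlation:", round(r, 3))

# Experimental case: 60 students are randomly divided into groups A and B;
# only group A would then receive the orientation programme.
group_a, group_b = assign_groups(range(1, 61))
print("group sizes:", len(group_a), len(group_b))
```

A coefficient close to +1 or -1 indicates a strong linear association, but in the non-experimental case it does not by itself show that height causes better performance, since extraneous variables have not been controlled.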
EXPERIMENTAL AND CONTROL GROUPS
In experimental hypothesis-testing research, a group that is
exposed to usual conditions (Group B in the above illustration)
is called a control group, while a group that is exposed to
some special condition is termed an experimental group
(Group A). A researcher can design studies which include only
experimental groups, or both experimental and control groups.
TREATMENTS
The different conditions under which experimental and control
groups are put are called treatments.
EXPERIMENTS
The process of examining the truth of a statistical hypothesis
relating to some research problem is called an experiment.
Further details are provided in the next Chapter on Design of
Experiments.
EXPERIMENTAL UNITS
The predetermined plots or blocks to which different
treatments are applied are called experimental units. The
researcher has to define these units very carefully.
Activity 4.1
Watch the video "Basic Research Designs" from teachertube.com and
list several features of a good research design.
(www.teachertube.com Basic Research Designs)
Learn how to define dependent and independent variables from the
web resource. List 5 good ideas.
(Web resource: www.sciencebuddies.org Dependent Variable)
What is the concept of experimental error? What are Type I and Type
II errors?
(Web resource: www.experimentresources.com > Statistics Tutorial)
To understand control groups, watch the video "Asch's Conformity
Experiment". Briefly list the learnings.
(www.youtube.com > Asch's Conformity Experiment)
SIGNIFICANCE OF A RESEARCH DESIGN
Research design gives the researcher the opportunity to
undertake the various research operations smoothly. It makes
research as efficient as possible, generating maximum
information with minimum effort, time and money. It is like a
blueprint which we need in advance to plan the methods to be
adopted for collecting the relevant data and the techniques to
be used in its analysis for preparation of the research project.
The researcher has to take all necessary precautions in
preparing the research design, as any error may upset the
entire project. The reliability of the results which a researcher
is looking for is directly related to the research design, which
constitutes a firm foundation for the entire body of research
work.
Activity 4.2
Read Research Methodology: Part 3 Research Design and
Planning. Write a brief note on significance of research design
(www.scribd.com > Research> Business & Economics)
FEATURES OF A GOOD RESEARCH DESIGN
The following are the main features of a good research design.
a. Simplicity: It should be simple and understandable
b. Economical: It must be economical. The technique
selected must be cost effective and less time-consuming.
c. Reliability: It should give the smallest experimental
error. It should minimize bias and maximize the
reliability of the data collected and analysed.
d. Workability: It must be workable. It should be
pragmatic and practicable.
e. Flexibility: It must be flexible enough to permit the
consideration of many different aspects of a phenomenon.
f. Accuracy: It must lead to accurate results
Activity 4.3
Some researchers make a grandiose research design, which is
not practical. Please comment on workability of a research
design in this context.
STEPS IN A RESEARCH DESIGN
Stated in simple language, a research design is a plan of action,
a plan for collecting and analysing data in an economic, efficient
and relevant manner. Whatever be the nature of the design, the
following steps are generally followed.
• Selection and Definition of a problem: The problem
selected for study should be defined clearly in operational
terms so that the researcher knows positively what facts he
is looking for and what is relevant to the study.
• Source of Data: Once the problem is selected it is the
duty of the researcher to state clearly the various sources
of information such as library, personal documents, field
work, a particular residential group etc.
• Nature of Study: The research design should be
expressed in relation to the nature of study to be
undertaken. The choice of the statistical, experimental or
comparative type of study should be made at this stage
so that the following steps in planning may have
relevance to the proposed problem.
• Object of Study: Whether the design aims at theoretical
understanding or presupposes a welfare notion must be
explicit at this point. Stating the object of the study helps
not only in clarity of the design but also in obtaining a
sincere response from the respondents.
• Socio-Cultural Context: The research design must be
set in the socio-cultural context. For example, in a study
of the fertility rate among people of a backward class, the
context of the so-called backward class of people and the
conceptual reference must be made clear. Unless the
meaning of the term is clearly defined there tends to be a
large variation in the study, because the term backward
could have religious, economic and political connotations.
• Temporal context: The time period and geographical
limit of the design should also be stated at this stage, so
that it is clear that research related to the hypothesis is
applicable to a particular social group only.
• Dimension: It is usually impracticable to analyze the
data collected from a large universe. Hence the selection
of an adequate and representative sample is essential in
any research.
• Basis of Selection: The mechanics of drawing a random,
stratified, purposive, double, cluster or quota sample,
when followed carefully, will produce a scientifically valid
sample in an unbiased manner.
• Technique of Data Collection: A technique suitable to
the study design has to be adopted for the collection of
the required data. The relative merits of observation,
interview and questionnaire, when studied together, will
help in the choice of a suitable technique. Once the
collection of data is complete, coding, analysis and
presentation of the report naturally follow.
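The Dimension and Basis of Selection steps above can be made concrete with a short sketch. The code below is an illustration under stated assumptions, not a prescribed procedure: the universe of 100 households (70 urban, 30 rural) is hypothetical, and the function names simple_random_sample and stratified_sample are illustrative helpers built on Python's standard random module. The stratified version draws the same fraction at random from each stratum, so the sample keeps the proportions of the universe.

```python
import random
from collections import defaultdict


def simple_random_sample(population, n, seed=0):
    """Draw n distinct units, each unit having an equal chance of selection."""
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    return rng.sample(list(population), n)


def stratified_sample(population, stratum_of, fraction, seed=0):
    """Draw the same fraction at random from every stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in population:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for key in sorted(strata):
        members = strata[key]
        k = max(1, round(len(members) * fraction))  # at least one unit per stratum
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical universe: 100 households, 70 urban and 30 rural.
households = [("urban", i) for i in range(70)] + [("rural", i) for i in range(30)]

srs = simple_random_sample(households, 10)
strat = stratified_sample(households, lambda h: h[0], 0.10)

print("simple random sample size:", len(srs))
print("urban units in stratified sample:", sum(1 for h in strat if h[0] == "urban"))
print("rural units in stratified sample:", sum(1 for h in strat if h[0] == "rural"))
```

A simple random sample of 10 may by chance over-represent or under-represent rural households, whereas the stratified draw fixes their share in advance. Purposive and quota samples depend on the investigator's judgment and cannot be reduced to a random draw in this way.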
Activity 4.4
Read from the web resource "The Socio-Cultural Context of
Research" and enumerate 6 important ideas.
(www.qem.org/PBLsociocult.ppt.pdf)
What technique of data collection will you use for
A] Determining the traffic count at a busy crossing
B] Determining high-paying customers in a mall
C] Finding the varieties of flowers in a garden
OBJECTIVITY IN A RESEARCH DESIGN
CONCEPTS OF OBJECTIVITY
A study in which the subject matter is the centre of attention
and prejudices are given no place is known as an objective
study. Objectivity is scientific observation, collection and
analysis of data without prejudices and attachments. In an
objective study the subject matter is observed and described
as it is, without exaggeration or diminution. The investigator
should rely only on his senses and reasoning and not on
feelings and beliefs. Even if his feelings are against the
results of an investigation, he should accept them if they
have been achieved through scientific procedure. Reasoning
and intellect are more important than belief and faith in an
objective study. It gives priority to fact over fiction.
NEED OF OBJECTIVITY
Objectivity is a must in order to arrive at general and
universal conclusions in a scientific study. The following
points highlight the need of objectivity.
a. To make research design scientific: To make the
research design scientific it is essential that
questionnaires, schedules, statistical methods and
metric scales are used.
b. To make scientific conclusions: Every research aims at
scientific conclusions. Scientific conclusions are not
influenced by imaginations, feelings, prejudices and
impressions etc. They are based on facts and reasoning.
c. To achieve representative Facts: In order to obtain
representative facts, it is essential that the sample be
selected scientifically and not according to the
preferences of the investigator.
d. For verification: Verification is a necessary condition in
a scientific study of facts and conditions. The
conclusions must be verified by repeated studies of the
identical phenomenon through identical methods in
identical circumstances. These repeated studies
progressively eliminate the elements of error in the
conclusions, so that they may assume the form of a
principle or law. Objectivity is a must for verification
of the results.
e. For actual study of the phenomenon: Objectivity is a
must for actual study of the phenomenon. If the
investigator has an objective attitude he may arrive at
conclusions which may be universally accepted.
f. In order to know the possibilities of fresh research:
Objective study shows new possibilities of research. The
objective study aims at discovery of unknown facts. It
helps in finding out aspects which may be explored
through independent research
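The point made above about verification, that repeated studies reduce the elements of error, can be illustrated numerically. The sketch below rests on assumptions that are not in the text: each repetition is taken to carry an independent random error with the same standard deviation (the value 10 is hypothetical), and the standard result is used that the random error of the average of n such repetitions is sigma divided by the square root of n. The function name standard_error is an illustrative helper.

```python
import math


def standard_error(sigma, n_studies):
    """Random error remaining in the average of n independent repetitions."""
    if n_studies < 1:
        raise ValueError("at least one study is required")
    return sigma / math.sqrt(n_studies)


sigma = 10.0  # hypothetical random error of a single study
for n in (1, 4, 25, 100):
    print(n, "repetitions -> remaining random error", standard_error(sigma, n))
```

The remaining error shrinks as the number of repetitions grows, but only for random error. A systematic bias that is built into the method is repeated in every study and is not removed by averaging, which is why objectivity in the design itself remains necessary.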
Activity 4.5
How are elements of error removed by repeated study of a
phenomenon?
Can some facts be universally acceptable? Give a few examples.
DIFFICULTIES IN OBJECTIVE RESEARCH
The following are the main difficulties encountered in Objective
Research
a. Difficulty of detachment of the investigator: The
biggest problem before the investigator is to keep himself
detached from the subject of study. His attachments are
impediments in keeping the research study objective and
scientific.
b. Influence of popular Notions: The current notions of
the business may act as impediments in the objectivity of
research study.
c. Fallacy of particularity: An impediment in the
objectivity of research studies is the fallacy of
particularity. For example to say that the only one cause
of indiscipline among workers is the union activism is to
commit this fallacy since union activism is only one factor
in indiscipline and it may not be applicable in the case of
every worker.
d. Confusion of General Knowledge with actual
knowledge: An objective study aims at actual knowledge
and not general knowledge. Sometimes an investigator
confuses the general and current knowledge with real
knowledge. This dependence over general knowledge is a
serious impediment in the objectivity of the study.
e. Possibility of contradictory prejudices: In order to
decide over a managerial issue one may be either in
favour or against it. This, however, makes the attitude
prejudicial.
f. Ethnocentrism: It means favouring the race, caste
group, society, community, religion, culture, language
and literature of which one is a member and follower.
Thus ethnocentrism is a serious prejudice in one's own
favour and against others.
g. External pressures: External pressures seriously
interfere with the objectivity of research studies.
h. Personal Interests of the investigator: If the interests
of the investigator are in some way connected with the
problem of study, there is little possibility of objectivity.
If the collection and publication of the facts may harm his
interests, the investigator will make all possible efforts to
restrict their collection and publication.
i. Need for quick judgment: In some studies there is
an urgent need for quick judgment. However, in such
situations the judgment is hardly objective, and this
becomes one of the impediments.
j. Attitude and prejudices: Attitudes and prejudices
are the most important impediments in the objectivity of
a study. While there is no fear in adopting an objective
attitude towards physical phenomena, one has to face
several types of fears in adopting an objective attitude in
social studies. This fear may be due to the state, the clan,
the family or the group and is likely to create adverse
influences, which are impediments in the objectivity of
the study.
Activity 4.6
Study the web resource "Particularity Assumptions". Describe
the underlying idea of such assumptions.
MEANS OF OBJECTIVITY OR METHODS OF ACHIEVING
OBJECTIVITY
The following means may be useful in achieving objectivity in
research studies.
a. Use of experimental Methods: Experimental method
helps in achieving objectivity in study. The statistical and
quantitative techniques are used to arrive at conclusions.
These methods have no place for personal ideas, feelings,
ideals, values and impressions. The experimental method
is universal and democratic. It involves verification and
reverification. Therefore, it is a useful means to arrive at
objectivity in research studies.
b. Standardization of Technical terms and concepts: In
order to achieve objectivity in research studies, the
terms, concepts used must be definite and clear. No term
should be used in more than one sense. In order to
facilitate this process, each technical term and concept
should be rendered definite and clear before its actual
use. This standardization of terms and concepts will
provide objectivity to research studies.
c. Use of questionnaires and schedules: Questionnaires
and schedules are used in research studies to avoid
personal prejudices and bias. These are considered as one
of the best methods of observation. However, these
methods have also some serious limitations.
d. Use of Random Sampling: The objectivity of research
studies very much depends upon selection of
representative samples. If the investigator selects the
samples according to his own inclination they are
generally defective. In order to avoid this problem,
random sampling is widely used in research studies.
e. Use of Group Investigation Techniques: Most of the
defects in research studies are due to personal and
subjective factors of the investigator. In order to
reduce this element of error, a research problem is now
studied not by one investigator, but by a group of
investigators. This technique of group study offsets
the personal factors connected with different
investigators.
f. Use of Control methods: There is a positive correlation
between the amount of control and the objectivity of
methods in social sciences.
g. Use of Mechanical Tools: In order to eliminate the
personality or mental factors and to remove this source of
error it is useful to utilize more mechanical tools such as
the typewriter, tape recorder, photo camera etc.
h. Use of Inter-Disciplinary method: Besides mental or
psychological aspects, research problems have economic,
political, religious, cultural and many other aspects.
Since all these aspects are interrelated, it is
necessary to know all of them in order to arrive at a
complete knowledge of the problem. For this purpose the
inter-disciplinary method is used. If the inter-disciplinary
method is faithfully followed, it leads to a high degree of
objectivity in research studies.
Activity 4.7
"Objectivity gets clouded when the personal interest of the
researcher gets entwined with the objectives of research."
Comment.
Cite at least 2 examples of such conflict situations.
"In today's world mechanical tools are being supplemented by
tools of information technology." Comment.
Where can the following devices be used for research?
A] Satellites
B] Mobile Phones
SOURCES OF PREJUDICES AND BIASES
Following is the systematic list of biases likely to occur in
research design;
a. Bias of observer: While observing any phenomenon we
are liable to concentrate on some facts and miss or ignore
others due to our builtin value system, preconceptions,
interest and sentiments. But a good observer must see
facts as they are and not as he wishes to see them. He
must see all relevant details and suspend his personal
judgment while he is observing.
b. Bias of Informers: In research, a researcher has
to collect facts by making queries from people. They
usually avoid answers which are likely to be controversial.
Sometimes they supply wrong answers because they do
not quite understand the question. In order to avoid these
pitfalls, a researcher must be able to approach his
informers tactfully, create confidence in them and make
them realize the value of their cooperation.
c. Bias due to Sample: A research can be meaningful and
useful only if we choose a representative sample for
investigation. A haphazard and careless choice of samples
can seriously prejudice the cause of research.
d. Defective Questionnaire: The quality of answer is
dependent upon the quality of questions. If questions are
ambiguous and capable of being understood variously the
answers will be indefinite and unreliable. Therefore while
formulating a questionnaire; adequate care should be
taken to include only unambiguous and clear questions.
e. Defective Data Collection: The validity of research is
determined by the validity and correctness of data
collected. Only trained workers are capable of avoiding
errors in the collection of data.
f. Defective Analysis: After collection of data, appropriate
analysis is the next requisite. Only if the analysis is
proper there can be hope for valid conclusions. In this
connection special attention must be paid to proper
classification of data.
g. Defective Generalization: Having analyzed the data,
valid generalization becomes possible. Personal bias in
any form can vitiate the conclusions; hence one must be
completely objective in deriving conclusions from the
facts before oneself.
h. Sentiment Factors: While research in the physical
sciences hardly produces any emotional reverberations in
the scientist, the observer of social events is emotionally
affected by them. Only by training can such prejudices be
minimized.
i. Common Sense Bias: We usually have a number of
common sense notions regarding social facts, events and
processes. These built-in notions many a time
prevent us from perceiving the scientific truth, and
built-in prejudices close our minds to fresh ideas. The
mind tends to reject what does not fit in with its existing
system of ideas, and only by training can this tendency be
overcome.
j. Bias Due to Attitude and Aptitudes: We see things in
the light of our own point of view. We may very well miss
what we do not want to see. A well trained researcher keeps
his judgment always suspended, and even when certain
findings are highly unpalatable to him personally he
refuses to distort the facts.
k. Time factors in Research: The general maxim that
hurry makes worry and haste makes waste applies very
much to research. If, under the pressure of practical need
or lack of time, a researcher formulates his basic
hypothesis in a hurry and does not devote adequate time
to data collection, analysis etc., his conclusions will not be
very reliable, as the probability of errors in his sources
remains. As a matter of fact, undue hurry as well as undue
delay is prejudicial to good research.
Activity 4.8
(i) Can you relate any idea from the celebrated book "Pride and
Prejudice" to the concept of Biases and Prejudices in Research?
(Book Resource: Pride and Prejudice by Jane Austen)
(ii) How do defective questionnaires make research unreliable?
(iii) In social research, sentimental factors can make bias
creep in. Cite a couple of examples.
(iv) Longitudinal Research runs over a long period of time.
What are other characteristics of longitudinal Research?
(www.socialresearchmethods.net/tutorial/Cho2/Cho1.html)
TYPES/FORMS OF A RESEARCH DESIGN
For convenience of study, research designs may be
categorized under the following heads:
i. Exploratory or formulative design
ii. Descriptive or diagnostic design
iii. Experimental design
EXPLORATORY OR FORMULATIVE DESIGN
Another name for exploratory research is formulative research
studies. Since the basic purpose of such studies is to discover
new ideas or insights, research design in such studies should be
more flexible. The need for such flexibility is because in
exploratory or formulative research the problem is loosely
defined as it is the process of exploratory studies that would
lead to more focus on the specific research area. Generally, the
formulative or exploratory research leads to a more precise
definition of the research problem and at times also necessitates
changes in research methodology and paradigm for data
collection.
Activity 4.9
List 5 applications of exploratory research in business.
CHARACTERISTICS OF EXPLORATORY DESIGN
The exploratory design must possess the following
characteristics.
a. Business Significance: Unless the problem has a place
in the industry or has business significance, its study shall
be useless and meaningless.
b. Practical Aspect: It should be of practical value to the
management. If it has no practical value it shall be
useless for business decisions.
c. Combination of Theory: Mere practical significance of
the problem has no meaning unless it is based on theory.
If a particular problem is based on certain theoretical
aspects it shall be possible for the researcher to judge its
utility and proceed with his study in the right direction.
d. Reliable and valuable facts: In the absence of reliable
and valuable facts, the study of the problem shall be of no
managerial significance.
ROLE/SIGNIFICANCE OF EXPLORATORY DESIGN
Its role can be emphasized owing to following aspects:
a. Information about the immediate conditions: The
design provides information about the conditions of the
problem. When the investigator does not have resources
and capability to test the hypothesis he is able to find
facts through exploratory design which is suitable to or in
accordance with the hypothesis.
b. Presentations of Important Problems: Through
exploratory and formulative designs, it is possible to
present important research problems. Once the problems
have been presented, the investigator is automatically
attracted towards the study of the problem that has
greater importance for our society.
c. Study of the unknown fields: For research, a theory or
hypothesis is indispensable, as it provides the proper
basis. In order to formulate a hypothesis in an unknown
field, we have to acquire the relevant information, and
this task is achieved through exploratory design.
d. Theoretical Base: The research problem deals with our
social life and social problems and data about them can
only be collected through exploratory design. This design
is helpful in providing a theoretical base to the hypothesis
and theories.
e. Presentation of uncertain problem for study in
research: Through exploratory designs we are able to
identify problems that are still uncertain or ill-defined.
This method, on the one hand, focuses the attention of
the investigator on the problem and, on the other, helps
him to collect facts on scientific lines so that research
may be carried out correctly.
Activity 4.10
Nanotechnology is a relatively unknown field. Watch the video
"Nanoscale Exploratory Research Lab". See how exploratory
research can be applied to study unknown fields.
(www.youtube.com NETLibm)
METHODS OF EXPLORATORY RESEARCH DESIGN
Generally the following three methods are adopted:
a. Review/Survey of the Concerned Literature: When
the investigator proceeds on the path of research he has
to take advantage of his predecessors. He has also to
take help from what has already been done. This would
save him from the trouble of trial and error, and also
economize his efforts. There are various hypotheses
available. From these hypotheses, the investigator has to
select those that are useful to him. Review and perusal of
pertinent literature is very useful to him. Apart from
literature directly connected with the problem, the
literature that is connected with similar problems is also
useful for him. It helps in the formulation of the problem
in a clearcut manner.
b. Experience Survey: As the name suggests, it implies
seeking experiential learning from the people who have
practically dealt with the research problem at hand. It is
important that for such a survey only such people who
are well versed in the area of research and have mental
ability to contribute new ideas and suggestions are
selected as respondents. This is important because the
objective of such a survey is to gain new insights into the
relationships between the research variables and new ideas
that may bear on the solution of the problem. The
respondents who are
selected for such a survey are interviewed by the
researcher on the basis of a carefully prepared interview
schedule. Therefore, experience survey enables an
investigator to define the research problem more
concisely and also assists in hypothesis formulation.
c. Analysis of insight stimulating cases: A case-based
research design is particularly needed for research
areas where there is very little prior experience. The
approach is to put together test cases which embody
those events, incidents and phenomena that have a direct
bearing on the research problem. Such cases can be put
together by examining records, unstructured interviewing
and the researcher's observations. It is therefore
important that the researcher has the aptitude, attitude
and insight to put together diverse information into a case
which can be further interpreted in order to provide
bearings on, and solutions to, the research problem.
Activity 4.11
Read about use of experience survey in Postgraduate
Experience of education. Comment
FACTORS AFFECTING THE ANALYSIS OF INSIGHT STIMULATING
CASES
The following are the important factors which affect the
study:
a. Attitude of the investigator: In a case study the
attitude of the investigator is very important. If the
investigator is receptive and sensitive to the various
developments that take place in the field of his study, he
is able to make steady progress. As a result of his
receptivity he does not concentrate merely on the
available data, but also takes note of the peculiarities
and special features of the subject of the case study. The
study therefore continues to change according to the
available information. When a new hypothesis emerges, the
investigator also changes the standards and measurements
used for the selection and collection of case material.
b. Intensity of the case study: The next characteristic of
the case study is its intensity. Under the case study
method, the subject matter is studied in all its
dimensions and ramifications. Such a study is not limited
to the present alone, but is carried out against the
historical background of the business, industry or firm.
c. Integrative powers of the investigator: The third
factor is the investigator's integrative power. On this
basis the investigator collects even the minutest
possible information about the subject matter.
MOTIVATING EVENTS FOR INVESTIGATORS
Generally, the following types of events are able to stimulate
the investigator:
a. Reaction of strangers: When we want to study the
characteristics of a particular group or community, the
reactions of the strangers are very valuable. Reactions of
the strangers throw more light on the characteristics of a
community and present more reliable and dependable
picture of the society.
b. Transitional Cases: Such problems that have
transitional nature or those problems that continue to
progress from one stage to the other are very stimulating
for the investigator.
c. Landmark cases: The landmark cases in fact are
responsible for industry development and they are bound
to be stimulating and interesting for the investigator.
d. Characteristics of the industry: The characteristics of
the industry throw light on industrial conditions. Apart
from this, they are also of interest and insight
stimulating for the investigator.
e. Position in different industry structures: In an
industry there are firms that represent different stages of
maturity. These organizations represent the outlook and
the characteristics of these industry structures, and
stimulate the investigator to conduct the research.
Activity 4.12
In legal research landmark cases or judgments influence a
researcher to a great degree. Comment
What is the role of reaction of strangers in stimulating an
investigator?
Can this sometimes be counterproductive?
DESCRIPTIVE AND DIAGNOSTIC DESIGN
Descriptive research studies are those studies which make
specific predictions regarding the outcome of a research issue
along with descriptions of facts and other characteristics
concerning the research sample or situation.
On the other hand, diagnostic research studies examine
whether certain research variables have a causal association
with other phenomena. In brief, diagnostic studies are based
on cause and effect relationships.
Activity 4.13
What fields of research are amenable to
A] Descriptive Design
B] Diagnostic Design
DIFFERENCE BETWEEN DESCRIPTIVE AND DIAGNOSTIC DESIGNS
The difference between the two studies lies mainly in regard to
the following:
a. Difference in field: The diagnostic design of research
is concerned with the express characteristics of existing
social problems. It tries to find the relationship between
the problem and its express causes and proposes a
diagnostic course of action. In contrast, descriptive
designs are concerned only with the existing or present
form of the problem. A descriptive study begins with
research into the past form of the problem and does not
concern itself with the diagnostic approach or activities.
Diagnostic designs study the real form of the problems and
also suggest ways and means for their solution.
b. Difference in Hypothesis: Diagnostic designs are
entirely motivated by the hypothesis, and the material is
collected on the basis of the formulated design.
Descriptive designs, on the other hand, are not entirely
motivated by a hypothesis; the hypothesis is formulated on
the basis of the description of the existing data or
material.
c. Difference in Objective: From the point of view of
objectives, the diagnostic design seeks knowledge that can
be put into practice in solving the problem. The
diagnostic design is thus equally concerned with the cause
and with the treatment. In descriptive design, by
contrast, the main object is to acquire knowledge. It has
nothing to do with the solution of the problem or the
treatment of its causes.
d. Treatment: Under the diagnostic design, an immediate or
timely solution to the causal elements is necessary.
Before studying other references, the investigator tries
to remove or resolve the factors and causes that have
given rise to the problem. It is the main causes behind
the development of the problem that the investigator
studies in the diagnostic design. Under descriptive
designs, no attempt is made to solve or modify the
existing or present causes of the problem.
e. Difference in the field expansion: There are certain
fields where the knowledge about the problem has not
been developed properly. In such circumstances,
descriptive designs are more useful and effective as
compared to diagnostic designs. In spite of these
differences, descriptive and diagnostic designs are very
intimately related to one another. Whenever a problem is
to be studied, characteristics and elements of both these
designs have to be taken into account and put to use. In
other words, it may be said that the difference is only for
convenience of study. In actual practice, both these
designs have a contribution, and their elements have to
go together.
Activity 4.14
Is medical research largely of the diagnostic type?
Discuss with few examples.
Watch Videos [www.youtube.com]
A] Philosophy of Experimental Design
B] Foundation notes on Experimental Design
List the high points described in videos
EXPERIMENTAL DESIGNS
Experimental design refers to the options available in
designing research experiments so that their outcomes can
lead to a possible solution of the research problem.
Experimental design is generally classified into two
categories:
i) Informal experimental design
ii) Formal experimental design
Informal experimental design is generally based on rules of
thumb rather than extensive scientific analysis. On the other
hand, formal experimental design needs precise statistical
procedures for both design and analysis. The concept of
experimental design is discussed in detail in a later
chapter.
SUMMARY
In this chapter the definitions of research design, dependent
and independent variables, extraneous variables, control,
confounded relationship, research hypothesis, experimental
and non-experimental hypothesis-testing research,
experimental and control groups, treatments, experiment
and experimental unit have been explained. The need for, or
significance of, research design has been highlighted.
Features of a good research design have been included.
The steps or procedure of a research design have been
described. The problem of objectivity and its need in
research design has been incorporated. The difficulties in objective
research and means of achieving objectivity in research
design have been explained. The various types of research
design have been described. Exploratory or formulative
design, its characteristics and significance have been
discussed. The methods of exploratory design, i.e.,
Review/Survey of Literature, Experience Survey, Analysis of
insight stimulating cases have been explained. Descriptive
and diagnostic design have been discussed. The difference
between descriptive and diagnostic design has been
described. The motivating events for investigation have been
included.
REVIEW QUESTIONS
1. Explain the following terms.
i. Research Design,
ii. Elements of Research Design
iii. Dependent and independent variables
iv. Extraneous variables,
v. Control
vi. Confounded relationship
vii. Research hypothesis
viii. Experimental and control group
ix. Treatments
x. Experiment
xi. Experimental unit
2. What is research design? Explain its importance
3. Explain the features of a good research design and
describe the steps of conducting a research design.
4. What is objectivity? State its need in Research design
5. Explain the difficulties of objective research. How
can objectivity be achieved in a research design?
6. Explain the sources of prejudice and bias in a research
design
7. Explain the exploratory or formulative research design
and describe its characteristics
8. Distinguish between exploratory and descriptive research
design
9. Distinguish between experimental and non experimental
research design
10. Explain the role of exploratory research design
11. Describe the methods of exploratory research design
12. Differentiate between experience survey and analysis of
insight stimulating cases.
13. Describe the motivating events for the investigators
14. Explain the difference between exploratory design and
descriptive and diagnostic designs.
FURTHER READINGS
• Creswell, John W. (2008). Research Design: Qualitative,
Quantitative and Mixed Methods Approaches. Newbury
Park, CA: Sage Publications.
• Marczyk, G.R., DeMatteo, D. & Festinger, D. (2005).
Essentials of Research Design and Methodology. New
York City, NY: Wiley.
• Ethridge, Don E. (2004). Research Methodology in
Applied Economics. Daryaganj, ND: Wiley Blackwell.
• Bergh, D. and Ketchen, D. (2009). Research Methodology
in Strategy and Management. Bingley, UK: Emerald Group
Publishing.
CHAPTER 5 EXPERIMENTAL DESIGN
Objectives
After reading this chapter the learner will be able to:
• Understand the fundamentals of experimental design and
its need for research
• Comprehend the basic principles and steps needed in
planning an experimental design
• Appreciate some important experimental designs
• Understand some problem areas in the concept of
experimental design
Structure
• Fundamentals of Experimental Design
• Need for Experimental Design
• Basic Principles of Experimental Design
• Steps in Planning an Experimental Design
• Important Experimental Designs
• Difficulties in Experimental Designs
• Summary
• Review Questions
• Further Readings
FUNDAMENTALS OF EXPERIMENTAL DESIGN
EXPERIMENT
An experiment may be defined as an observation under
controlled conditions. In its broadest meaning, an experiment
may be considered a way of organizing the collection of
evidence so as to permit one to make inferences about the
tenability of a hypothesis.
EXPERIMENT DESIGN
Experimental research studies generally require the testing of
hypotheses about causal relationships amongst the variables.
Naturally, these types of research studies require procedures
that not only reduce bias but also lead to inferences
about causality. This leads to the necessity for experimental
design. As briefly discussed in the previous chapter,
experimental design develops a framework of experiments based
on rules of thumb or statistical procedures. Details are
provided later in this chapter.
FACTOR
A factor is a variable or attribute which influences or is
suspected of influencing the characteristic being investigated.
A factor may be variable and measurable (e.g., temperature in
degrees, time in seconds) or it may be an attribute whose
presence or identity may be determined (e.g., name of the
operator, left- or right-hand glove).
LEVEL
The values of a factor being examined in an experiment. If the
factor is an attribute, each of its states is a level. For example, if
a switch is either on or off, switch setting (one factor) has two
levels. If the factor is a variable, the range is divided into two or
more intervals, each of which is then a level. For example, if
the temperature ranges from 800 °C to 950 °C and three ranges
are to be used, then 800 °C to 850 °C would be one level,
850 °C to 900 °C a second and 900 °C to 950 °C a third.
TREATMENT
One set of levels of all factors employed in a given experimental
trial. For example, an experiment conducted using temperature
T1, machine M2 and operator B would constitute one treatment.
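The relationship between factors, levels and treatments can be sketched in a few lines of Python. The factor names and levels below (three temperature ranges, two machines, two operators) are illustrative assumptions, not data from the text; each treatment is simply one level chosen for every factor.

```python
from itertools import product

# Hypothetical factors and their levels (illustrative values only).
factors = {
    "temperature": ["800-850 C", "850-900 C", "900-950 C"],
    "machine": ["M1", "M2"],
    "operator": ["A", "B"],
}

# A treatment is one level chosen for every factor.
treatments = [dict(zip(factors, combo)) for combo in product(*factors.values())]

print(len(treatments))  # 3 * 2 * 2 = 12 possible treatments
print(treatments[0])    # first combination of levels
```

With three, two and two levels there are twelve possible treatments; an experiment need not run all of them.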
EXPERIMENTAL UNIT
Facility for conducting a trial. The experimental units are
allocated to different treatments.
RESPONSE
Numerical result of a trial with a given treatment. Examples are
yield, purity, tensile strength, surface finish, number of
defectives etc.
EFFECT
The effect of a factor is the change in response due to a
change in the level of that factor.
MAIN EFFECT
Effect of a factor averaged over different levels of the other
factors.
INTERACTION
The difference in effect of one factor when a second factor is
changed from one level to another.
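As a rough numerical illustration of the definitions of effect, main effect and interaction, consider a small experiment with two factors at two levels each. The response values below are invented for the sketch, and the interaction is computed exactly as defined above (some texts divide this difference by two).

```python
# Hypothetical mean responses (e.g. yield) for a 2 x 2 experiment.
# Keys are (level of factor A, level of factor B); the values are made up.
response = {
    ("low", "low"): 20.0,
    ("high", "low"): 30.0,
    ("low", "high"): 24.0,
    ("high", "high"): 44.0,
}

def effect_of_a(b_level):
    """Change in response when A goes from low to high at a fixed level of B."""
    return response[("high", b_level)] - response[("low", b_level)]

effect_at_b_low = effect_of_a("low")    # 30 - 20 = 10
effect_at_b_high = effect_of_a("high")  # 44 - 24 = 20

# Main effect of A: its effect averaged over the levels of B.
main_effect_a = (effect_at_b_low + effect_at_b_high) / 2

# Interaction: difference in the effect of A when B changes level.
interaction_ab = effect_at_b_high - effect_at_b_low

print(main_effect_a, interaction_ab)
```

A non-zero interaction means that the effect of A cannot be described without saying which level of B is in use.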
RANDOMIZATION
The process of assigning experimental units to treatments in a
purely chance manner. The purpose is to ensure that sources of
variation not controlled in the experiment operate randomly, so
that their average effect on any group is zero.
REPLICATION
The execution of an experiment more than once. The purpose is
to increase precision and obtain a better estimate of the
experimental error.
EXPERIMENTAL ERROR
It is the variation in response when the same treatment is
repeated, caused by conditions not controlled in the experiment.
It is estimated as residual variation after the effects of main
factors and interactions have been removed.
Activity 5.1
Appreciate the concept of Level better by going through
Applying Experimental Design
(www.qulityamerica.com/knowledgecentre)
Read about Randomized Controlled Trials in Wikipedia.
Comment on this technique
( www.en.wikipedia.org/RandomizedControlledTrails)
NEED FOR EXPERIMENTAL DESIGN
To achieve the goal of process optimization and to prevent or
minimize the occurrence of defective product, a thorough
understanding of the process behavior under different sets of
process conditions is needed. This can best be obtained through
designed or planned experimentation. Planning an experiment
so that information relevant to the problem at hand will be
made available is known as designing an experiment.
Experience has shown that if the data collection is properly
planned, organized, summarized and interpreted using
statistical principles, one will be able to draw valid and
meaningful conclusions from the results. The design of
experiments has been found to be an excellent tool for effecting
engineering development, quality improvement, process
optimization and cost reduction.
In general, planned experimentation is necessary to distinguish
between critical factors (which have a dominating effect and
need to be controlled within narrow limits) and non-critical
factors (which are insignificant and do not require close
control), as well as to identify the optimum levels of the
critical factors so as to achieve significantly improved
performance.
Activity 5.2
How does one distinguish between critical factors and
non-critical factors in an Experimental Design?
Why do critical factors need to be controlled within narrow
tolerances?
BASIC PRINCIPLES OF EXPERIMENTAL
DESIGN
a. Principle of replication: Under this principle the
emphasis is on doing the same experiment more than once.
The researcher applies each treatment to many experimental
units instead of one, which increases statistical
accuracy. For example, we can get a more precise estimate
of the mean effect of any factor, since
$\sigma_{\bar{x}} = \sigma / \sqrt{n}$, where
$\sigma_{\bar{x}}$ is the standard deviation of the mean,
$\sigma$ is the true experimental error and $n$ is the
number of replications. The greater the value of $n$, the
smaller is the value of $\sigma_{\bar{x}}$.
b. Principle of randomization: The principle of
randomization protects the researcher against the effect
of extraneous factors in any experiment. It gives the
freedom to design and plan the experiment in such a
fashion that the variations caused by extraneous factors
can all be combined together and termed chance. For
example, to determine the differences in productivity of
different makes of machines (treatments), we may isolate
the possible effects due to differences in efficiency
among operators (blocks) by assigning the machines at
random to randomly selected operators.
The basic idea is to compare all treatment effects within a
block of experimental material by eliminating environmental
effects.
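The gain in precision from replication can be checked numerically. The sketch below assumes an experimental error (standard deviation of a single observation) of 4.0 units, a value chosen only for illustration, and shows how the standard deviation of a treatment mean shrinks as the number of replications grows.

```python
import math

def standard_error(sigma, n):
    """Standard deviation of a treatment mean based on n replications."""
    return sigma / math.sqrt(n)

sigma = 4.0  # assumed experimental error of a single observation
for n in (1, 4, 16):
    print(n, standard_error(sigma, n))
```

Because the divisor is the square root of the number of replications, quadrupling the replications only halves the standard deviation of the mean.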
The randomization procedure is carried out with the help of a
random number table, using the following steps:
a. Open a page of the table at random.
b. Select a column of numbers on that page at random.
c. Use the numbers in that column, in sequence, to
determine the order of the rows or columns to be chosen.
d. Omit extra numbers.
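Where a printed random number table is not at hand, a pseudo-random number generator is a common substitute. The sketch below uses Python's standard random module in place of the table, which is a swapped-in technique rather than the procedure listed above; the operator labels, machine makes and seed are hypothetical.

```python
import random

def randomize(units, treatments, seed=None):
    """Assign treatments to experimental units in a purely chance manner.

    Treatments are repeated in turn until every unit has one, then the
    order is shuffled, so group sizes differ by at most one.
    """
    rng = random.Random(seed)
    plan = [treatments[i % len(treatments)] for i in range(len(units))]
    rng.shuffle(plan)
    return dict(zip(units, plan))

# Hypothetical example: four makes of machine assigned to eight operators.
operators = [f"operator_{i}" for i in range(1, 9)]
assignment = randomize(operators, ["make_A", "make_B", "make_C", "make_D"], seed=1)
for operator, machine in assignment.items():
    print(operator, machine)
```

The fixed seed only makes the sketch repeatable; in a real experiment the seed would be left unset or recorded as part of the protocol.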
APPLICATION OF RANDOMIZATION OF TREATMENTS
The randomization procedure is appropriate under the
following circumstances:
1. Where the experimental units are homogeneous and
limited in quantity.
2. Where a sizable fraction of the units is likely to be
destroyed.
3. Where, in small experiments, the increased accuracy from
alternative designs does not outweigh the loss of error
degrees of freedom.
Activity 5.3
Read tutorials from web source on Principles of Experimental
Designs. Briefly describe 3 important parameters listed in it.
(www.emathzone.com/tutorials/basicprinciplesofexperimentaldesign.html)
RANDOMIZATION OF TREATMENTS
The treatments are assigned to the units within each group
entirely at random with the help of random numbers.
c. The principle of local control: Local control may be
defined as the balancing, blocking and grouping of the
experimental units employed in an experimental design. By
grouping we mean placing a set of homogeneous
experimental units into groups so that the different
groups may be subjected to different treatments. The
number of experimental units in different groups need not
be the same. By blocking we mean assigning the
experimental units to blocks in such a manner that the
units in a block are relatively homogeneous. By balancing
we mean adjusting the above procedures, i.e., grouping,
blocking and assigning the treatments, in such a manner
that a balanced configuration results.
As an example, suppose we are interested in testing the
efficiency of a vaccine in preventing conception in women. It
is not sufficient to say that, in a group of 100 women, the
proportion of women conceiving is lower in the vaccinated
group than in the non-vaccinated group. The individuals
compared should be similar. If we take two women of the same
age, caste and genetic make-up, say twins, and give the
vaccine to one while leaving the other unvaccinated, we have
a properly balanced configuration.
Activity 5.4
Can the principle of randomization be applied to a research
design evaluating the efficiency of anti-HIV drugs among AIDS
patients? Comment and elaborate
Read about the principle of local control in web resource. How
can it help in eliminating variability due to extraneous factors?
(http://frepedia.in/BasicPrinciplesofExperimentedDesign)
STEPS IN PLANNING AN EXPERIMENTAL
DESIGN
Planning of an experiment is a sine qua non for the successful
conduct of the experiment. The following steps are adopted in
planning an experimental design.
a. Selection of the Problem: Not every problem can be
studied through the experimental method. One of the major
conditions is the capacity to manipulate the independent
variable whose effect is to be studied. Studies of
advertising techniques, training methods, the effect of
group decisions and political propaganda are some
illustrations of problems that have been studied through
experimental design.
b. Proper Description of the selected problem: After the
problem has been selected it must be put in proper
language, i.e., the hypothesis must be stated in clear
and conceptual terms. The variables that affect the
phenomenon must be known and conceptualized. The
fundamental causative factor, or independent variable,
must also be decided and the plans for its gradual
manipulation must be clearly determined.
c. Selecting the settings: The background in which the
experiment is to be carried out is termed the setting. In
a laboratory experiment it is created artificially and
the experimenter decides how this can be done. In a field
experiment, a natural setting has to be located where the
experiment can be conducted.
d. Pilot Study: In planning an experiment, a pilot study
may be necessary so that the researcher is brought face
to face with realities and with many problems that had
not been anticipated. It also enables the researcher to
know more precisely the various causative factors
involved, the nature and working of the institution, and
the extent of cooperation or resistance to be expected.
The researcher thus becomes prepared to discuss the plan
of the experiment and its object with the key persons in
order to seek their cooperation.
e. Research Design: The most vital part of the research is
the research design, as it lays down the manner in which
the researcher will manipulate the situation in order to
study the desired effect. This in itself leads to the
problem of control over the phenomenon. For a successful
experiment it is necessary that the major independent
variable should be varied gradually while all other
causative factors remain unchanged. The following
methods are used to avoid problems in research design:
Activity 5.5
Attempt a proper description of following problem:
Due to recessionary trends the shopping malls are in trouble
Frame a properly worded hypothesis statement:
Watch video [Spiritual and Medicine Pilot Research Study of
NIMH] Did pilot study as showcased lead to satisfactory design?
(www.youtube.com spiritual & medicine pilot study research NIMH)
i. Use of control groups: To avoid the effects of
confounding factors, more than one similar group may be
selected. The group to which the stimulus is applied is
known as the experimental group, while the group to
which no such stimulus is applied is known as the
control group. For example, if we want to study the
effect of weekly tests upon examination results and the
general standard of knowledge among students, we select
two groups of students. One group is subjected to weekly
tests and is known as the experimental group, while no
weekly tests are held for the other. The difference in
results is treated as being caused by the weekly tests.
In such cases, it is essential that other factors
influencing the result, viz. the type of papers and the
general atmosphere of the college, remain unchanged.
ii. Control through measurement: This can be exercised
only when all the causative factors and their effects
are known to the experimenter. At the start of the study
the experimenter identifies the causative factors at
work and measures their degree; at the end of the
experiment these factors are measured again and the
change in the result brought about by them is computed.
The experimenter can thus judge the effect of the
stimulus without any help from a control group. The
success of this method depends upon knowledge of such
causative factors and correct measurement of their
effect. Because of these inherent defects, this method
is not generally used.
iii. Replication: When this method is applied, the
experiment is repeated in the same setting after some
time; this can be successful only if the passage of time
has not introduced any major change in the nature and
composition of the subject. Repetitive surveys have been
more popular than repetitive experiments. It is held
that laboratory experiments, if repeated under similar
conditions, can help to confirm the inferences of an
earlier experiment.
iv. Insulation: There is yet another way of controlling the
phenomenon and this is through sealing it off from other
groups. Complete insulation is impossible; however,
free mixing can be avoided with much better results.
v. Problem of cooperation: It is absolutely essential to
have cooperation from the people under investigation
and from those in charge of the setting for the
successful conduct of the experiment. For instance, in a
study regarding the effect of labour participation in
management upon productivity, it is essential that the
management should be in favour of such an experiment
and that labour should also cooperate with the study.
Where cooperation is available from one side, it may be
wanting from the other, and the study would be hampered
by it. The extent of cooperation, however, depends
primarily upon the experimenter's personal influence,
the interest of the management, the nature of the
dislocation to be caused, the extent of benefit likely
to accrue, the power of the experimenter to gain the
confidence of the different parties concerned with the
study, etc.
Activity 5.6
Control through measurement is a technique used in the Chemical
Industry for process control. Comment on its usefulness in
Management Research
IMPORTANT EXPERIMENTAL DESIGNS
Experimental design is the basic framework or structure of an
experiment on which the whole research work is focused. There
are two broad classifications of experimental design: formal
experimental designs and informal experimental designs.
Formal experimental designs offer the researcher more control
and use precise statistical procedures for analysis of the
study, whereas informal experimental designs normally use less
sophisticated forms of statistical procedure for analysis. The
important experimental designs are as follows:
i. Informal experimental designs
a. Before-and-after without control design
b. After-only with control design
c. Before-and-after with control design
ii. Formal experimental designs
a. Completely randomized design (CR Design)
b. Randomized block design (RB Design)
c. Latin square design (LS Design)
d. Factorial designs
INFORMAL EXPERIMENTAL DESIGN
Before-and-after without control design
In this experimental design, a single test group is selected
and the dependent variable is measured prior to the
application of a specific treatment. The treatment is then
introduced and the dependent variable is measured again.
The interpretation is that the treatment produced the
difference (delta, Δ) in the dependent variable. An example
would be to observe the level of bacteria
in a public swimming pool before and after chlorination
treatment.
The main difficulty with such a design is that other
extraneous variations may occur while the treatment is being
introduced. To continue with the above example, it may happen
that rain falls while the chlorination treatment is being
applied, adding airborne bacteria to the swimming pool with
the rain water.
AFTER-ONLY WITH CONTROL DESIGN
In this type of experimental design, two areas, viz. a test
area and a control area, are selected. The treatment is
applied only to the test area. The dependent variable is then
measured in both areas at the same time, which makes it
possible to eliminate extraneous variations that affect both
areas. The impact of the treatment is assessed by subtracting
the value of the dependent variable in the control area from
the value obtained in the test area.
For example, a farmer has two adjacent fields of equal size.
Fertilizer is applied to one field (the test field) and not to
the other (the control field). After one month, the growth of
the crop is measured in both fields. If the average height of
the crop is 12 cm in the test field and 9 cm in the control
field, it can be deduced that the fertilizer led to an
increase of 3 cm. Other extraneous factors such as water,
rainfall and climatic conditions are common to both fields.
This experimental design is therefore superior to the
before-and-after without control design.
BEFORE-AND-AFTER WITH CONTROL DESIGN
This design is, in a way, an improvement on the first design
and also combines the control features of the second design.
In this experimental design two areas (test and control
areas) are again selected and the dependent variable is
measured in both for a common time period prior to the
treatment. The treatment is then applied only in the test
area and the dependent variable is measured again in both the
test and control areas for an identical time period after the
introduction of the treatment. The impact of the treatment is
determined by subtracting the change (delta) in the dependent
variable in the control area from the change in the dependent
variable in the test area.
This design is superior to the earlier two designs because it
avoids not only the extraneous variations but also the
variations arising from non-comparability of the test and
control areas.
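The arithmetic behind the three informal designs can be summarized in a short sketch. The function names are illustrative, the 12 cm and 9 cm figures come from the fertilizer example in the text, and the "before" heights in the last call are hypothetical values added only to show the calculation.

```python
def before_and_after_without_control(before, after):
    """Effect = change in the dependent variable in the single test group."""
    return after - before

def after_only_with_control(test_after, control_after):
    """Effect = test-area value minus control-area value, both measured after treatment."""
    return test_after - control_after

def before_and_after_with_control(test_before, test_after,
                                  control_before, control_after):
    """Effect = change in the test area minus change in the control area."""
    return (test_after - test_before) - (control_after - control_before)

# Fertilizer example from the text: 12 cm in the test field, 9 cm in the control field.
print(after_only_with_control(12, 9))  # 3

# Hypothetical before/after heights (cm) for the same two fields.
print(before_and_after_with_control(test_before=5, test_after=12,
                                    control_before=4, control_after=9))  # (12-5) - (9-4) = 2
```

In the last call the two fields did not start at the same height, which is exactly the non-comparability that the third design corrects for.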
Activity 5.7
List major differences between the various types of informal
experimental designs
FORMAL EXPERIMENTAL DESIGNS
Completely randomized design
This type of design involves the principle of replication and
the principle of randomization. It is in a sense the easiest
possible experimental design, and its procedure of analysis
is therefore also simpler. The basic characteristic of a
completely randomized design is that subjects are randomly
assigned to experimental treatments. For example, if we have
8 patients and wish to give four of them treatment A and the
other four treatment B, the randomization process gives every
possible group of four patients out of the eight an equal
opportunity of being assigned to treatment A, with the
remaining four receiving treatment B. The analysis procedure
required for such a design is called one-way analysis of
variance. This design provides the greatest number of degrees
of freedom for the error. Normally this design is used when
experimental areas are homogeneous. Strictly speaking, when
all possible variation due to uncontrollable experimental
factors is included under chance variation, the design of
experiment is known as a completely randomized design.
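The eight-patient example can be sketched as follows. The patient labels and the seed are hypothetical, and Python's random module stands in for whatever randomization device the researcher actually uses.

```python
import random
from math import comb

patients = [f"patient_{i}" for i in range(1, 9)]  # hypothetical labels

rng = random.Random(42)            # fixed seed only so the sketch is repeatable
group_a = rng.sample(patients, 4)  # four patients chosen at random for treatment A
group_b = [p for p in patients if p not in group_a]  # the rest receive treatment B

print("Treatment A:", group_a)
print("Treatment B:", group_b)
print("Possible groups of four:", comb(8, 4))  # 70
```

Each of the 70 possible groups of four has the same chance of being selected for treatment A, which is what makes the design completely randomized.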
Activity 5.8
Why is the before-and-after with control design superior to
other types of informal experimental design?
ADVANTAGES OF COMPLETELY RANDOMIZED DESIGN
This design has following advantages.
a. Complete flexibility is possible. The numbers of
replications can be varied at will from treatment to
treatment. It is possible to utilize all the experimental
data.
b. Statistical analysis is easy even if the number of
replications is not the same for all treatments.
c. The analysis remains simple even when results from
some units or treatments are rejected. The relative loss of
information due to such rejection is smaller than with
any other design.
RANDOMIZED BLOCK DESIGN
This is the most familiar and a very important design among
all experimental designs. Apart from the completely
randomized design, the simplest design to construct and
analyze is the randomized block design. The term randomized
block emanated from agronomic research, in which several
variables or treatments are applied to different blocks of
land to study the effect of replication on the experimental
outcome, such as the yield of different types of sugarcane
when variable amounts of water are used to irrigate the
fields. However, differences in sugarcane yield may be
attributable not only to the different strains of sugarcane
but also to differences in the fertility of the soil in the
various blocks of land. To remove the block effect,
randomization is achieved by assigning treatments at random
to plots within the blocks of land. Blocks are formed so that
each contains as many plots as there are treatments to be
experimented with, and one plot from each block is randomly
selected for each treatment. The scheme is easily understood
by looking at it as the field plan of an agronomic
experiment. For example, in the case of 4 treatments
(A, B, C, D) in six blocks of 4 plots, the arrangement can be
illustrated as below.
A D C B B D
B C A D D A
B D C D C B
C A A B A D
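A layout of this kind can be generated by randomizing the treatments afresh within every block, so that each block contains every treatment exactly once. The sketch below is a minimal illustration; the function name and seed are assumptions, and the printed plan will differ from the arrangement shown in the text.

```python
import random

def randomized_block_layout(treatments, n_blocks, seed=None):
    """Return one independently shuffled ordering of all treatments per block."""
    rng = random.Random(seed)
    layout = []
    for _ in range(n_blocks):
        block = list(treatments)
        rng.shuffle(block)  # fresh randomization inside every block
        layout.append(block)
    return layout

layout = randomized_block_layout("ABCD", n_blocks=6, seed=7)

# Print with blocks as columns and plots as rows, as in a field plan.
for plot in range(4):
    print(" ".join(block[plot] for block in layout))
```

Reading down any column of the printed plan gives one block, in which the four treatments appear once each in random order.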
The analysis of variance table for a randomized block design
will be as follows, where r is the number of rows and c is the
number of columns:
Source of variation   Sum of squares   Degrees of freedom   Mean square
Column treatment      SSC              (c - 1)              MSC = SSC/(c - 1)
Row treatment         SSR              (r - 1)              MSR = SSR/(r - 1)
Residual (error)      SSE              (r - 1)(c - 1)       MSE = SSE/[(r - 1)(c - 1)]
Total                 SST              (rc - 1)
The randomized block design is widely used in many types of
business research experiments. For example, to determine the
difference in output of various types of machines, we may be
able to isolate the effect due to differences in the
efficiency of workers by assigning machines at random to
randomly selected workers. The underlying idea in this kind
of experiment is to compare the effects of all treatments
within a block of the experimental set-up by eliminating
possible environmental effects.
By comparing the mean square of treatments with the mean
square of the remainder (error), it can be determined by an
F test whether the treatments have any effect, regardless of
the possibility of significant variation from block to block.
Activity 5.9
Why is the Randomized Block Design one of the most popular research designs?
[www.stat.wisc.edu/~wardrop/ssg/chap4.pdf]
Give 5 examples where RBD is used in business research.
ADVANTAGES OF COMPLETELY RANDOMIZED EXPERIMENTAL
DESIGN
Such a design offers the following major advantages:
a. It is very easy to plan.
b. It provides a great degree of flexibility because any number of factor types and replications may be used.
c. Statistical analysis of such a design is rather simple. This is so even when the number of replications differs between factor types or when the experimental errors are not similar from type to type of the factor.
d. Even when data are missing or rejected, the method of analysis remains quite simple in the completely randomized design. The loss of information due to missing data is smaller than in any other experimental design.
The major drawback of this design is that it is suited only when the number of treatments is small and the experimental conditions are homogeneous. When the number of treatments is larger, it is possible to select designs which are more efficient than the completely randomized design. Therefore, completely randomized designs are rarely used for field experiments where the number of treatments is relatively large.
Activity 5.10
What are the drawbacks of completely randomized experimental designs?
Where should such a design not be used?
What are the various ways in which problems in getting cooperation can be overcome?
(www.youtube.com How to get cooperation)
Solved example 1
Apply the technique of analysis of variance to the following data relating to the yield of 4 varieties of wheat in 3 blocks in a Randomized Block Design.
Varieties    Blocks
             A     B     C
I            10    9     8
II           7     7     6
III          8     5     4
IV           5     4     4
Solution
Varieties    Blocks
             A     B     C     Total
I            10    9     8     27
II           7     7     6     20
III          8     5     4     17
IV           5     4     4     13
Total        30    25    22    77
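Correction factor (C.F.):
\[ \text{C.F.} = \frac{T^2}{N} = \frac{(77)^2}{12} = \frac{5929}{12} = 494.083 \]
Sum of squares between columns (SSC):
\[ \text{SSC} = \frac{(30)^2}{4} + \frac{(25)^2}{4} + \frac{(22)^2}{4} - \text{C.F.} = 502.25 - 494.083 = 8.167 \]
Sum of squares between rows (SSR):
\[ \text{SSR} = \frac{(27)^2}{3} + \frac{(20)^2}{3} + \frac{(17)^2}{3} + \frac{(13)^2}{3} - \text{C.F.} = 529 - 494.083 = 34.917 \]
Total sum of squares (SST):
\[ \text{SST} = (10)^2 + (9)^2 + (8)^2 + (7)^2 + (7)^2 + (6)^2 + (8)^2 + (5)^2 + (4)^2 + (5)^2 + (4)^2 + (4)^2 - \text{C.F.} \]
\[ = 100 + 81 + 64 + 49 + 49 + 36 + 64 + 25 + 16 + 25 + 16 + 16 - \text{C.F.} = 541 - 494.083 = 46.917 \]
Residual sum of squares (SSE):
\[ \text{SSE} = \text{SST} - (\text{SSR} + \text{SSC}) = 46.917 - (34.917 + 8.167) = 3.833 \]
ANOVA table
Source of variation    Sum of squares    Degrees of freedom         Mean square               F ratio                F test (critical value)
Between columns        8.167             (c - 1) = 3 - 1 = 2        MSC = SSC/df = 4.083      4.083/0.639 = 6.4      F.05(2, 6) = 5.14
Between rows           34.917            (r - 1) = 4 - 1 = 3        MSR = SSR/df = 11.639     11.639/0.639 = 18.2    F.05(3, 6) = 4.76
Residual (error)       3.833             (c - 1)(r - 1) = 2 x 3 = 6 MSE = SSE/df = 0.639
Total                  46.917            11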
The calculated values of F are greater than the critical values at the 5% level of significance; hence the means differ significantly both between columns and between rows.
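The hand calculation in the solved example can be checked with a short Python sketch. The function name `rbd_anova` and the dictionary keys are illustrative assumptions; the data are the wheat yields from the example. No intermediate rounding is done, so the printed values may differ slightly from figures rounded by hand.

```python
def rbd_anova(table):
    """Two-way analysis of variance (one observation per cell)."""
    r = len(table)        # number of rows (varieties)
    c = len(table[0])     # number of columns (blocks)
    n = r * c
    grand_total = sum(sum(row) for row in table)
    cf = grand_total ** 2 / n  # correction factor T^2 / N

    sst = sum(x ** 2 for row in table for x in row) - cf
    ssr = sum(sum(row) ** 2 for row in table) / c - cf
    col_totals = [sum(row[j] for row in table) for j in range(c)]
    ssc = sum(t ** 2 for t in col_totals) / r - cf
    sse = sst - ssr - ssc

    msr = ssr / (r - 1)
    msc = ssc / (c - 1)
    mse = sse / ((r - 1) * (c - 1))
    return {
        "CF": cf, "SST": sst, "SSR": ssr, "SSC": ssc, "SSE": sse,
        "MSR": msr, "MSC": msc, "MSE": mse,
        "F_rows": msr / mse, "F_columns": msc / mse,
    }


yields = [
    [10, 9, 8],  # variety I
    [7, 7, 6],   # variety II
    [8, 5, 4],   # variety III
    [5, 4, 4],   # variety IV
]
result = rbd_anova(yields)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

The two F ratios are then compared with the tabulated critical values F.05(3, 6) = 4.76 for rows and F.05(2, 6) = 5.14 for columns.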
LATIN SQUARE EXPERIMENTAL DESIGN
This experimental design also emerged out of agronomic experimentation and is extensively used where there is a need to eliminate the trend of soil fertility in two directions simultaneously. In such a design the data are classified in rows and columns according to different treatments and varieties and are organized in the form of a square, which is called a Latin Square. The term Latin Square came from a mathematical puzzle that was devised many years before such experiments came into being. Since there have to be as many replications as there are treatments, the domain of the experiment is divided into slots organized in a square, so that there are as many slots in each row as there are in each column. This number is also the same as the number of treatments. The slots are then assigned to the various treatments in such a manner that each treatment occurs only once in each row and only once in each column. This can be organized in a large number of ways; however, the particular layout used must be determined randomly.
Let us take a case where the data are to be classified in rows, columns and treatments (represented by the letters A, B, C, D, etc.). In this case a Latin Square is an arrangement of letters (i.e. treatments) in a square such that each letter occurs only once in each row and only once in each column. To generalize, a Latin Square of order n is an arrangement of letters in a square with n rows, n columns and n treatments, in which each letter occurs once and only once in each row and in each column, so that each letter occurs n times in all. Various permutations are possible and lead to different Latin Square constructions. For example, Latin Squares of various orders are shown below:
3x3
A B C
B C A
C A B
4x4
A B C D
B A D C
C D A B
D C B A
5x5
A B C D E
B A E C D
C D A E B
D E B A C
E C D B A
Thus the total number of ways in which Latin Squares can be arranged is very large. The totals are given in the following table:
Size of square    Number of different squares possible
2x2 2
3x3 12
4x4 576
5x5 161,280
6x6 812,851,200
7x7 61,479,419,904,000
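The defining property of a Latin Square, and the first few counts in the table, can be verified with a small Python sketch. The function names are illustrative assumptions, and the brute-force counter is practical only for small orders; it is meant as a check, not as an efficient enumeration method.

```python
def is_latin_square(square):
    """True if every symbol occurs exactly once in each row and each column."""
    n = len(square)
    if any(len(row) != n for row in square):
        return False
    symbols = set(square[0])
    if len(symbols) != n:
        return False
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all({row[j] for row in square} == symbols for j in range(n))
    return rows_ok and cols_ok


def count_latin_squares(n):
    """Count all n x n Latin Squares by filling the cells one at a time."""
    grid = [[None] * n for _ in range(n)]

    def fill(cell):
        if cell == n * n:
            return 1  # every cell filled without a clash
        i, j = divmod(cell, n)
        # Symbols already used to the left in this row or above in this column.
        used = set(grid[i][:j]) | {grid[k][j] for k in range(i)}
        total = 0
        for symbol in range(n):
            if symbol not in used:
                grid[i][j] = symbol
                total += fill(cell + 1)
        grid[i][j] = None
        return total

    return fill(0)


five_by_five = ["ABCDE", "BAECD", "CDAEB", "DEBAC", "ECDBA"]
print(is_latin_square(five_by_five))
for order in (2, 3, 4):
    print(order, count_latin_squares(order))
```

Since the number of possible squares grows so quickly, in practice one square is selected at random rather than enumerating them all.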
IMPORTANCE OF LATIN SQUARE DESIGN
The Latin Square is an important design of experiment as it provides a wide range of choice to an experimental researcher. In addition to agronomic experimentation, this design is used in industrial, educational, sociological, psychological and medical experimentation. The Latin Square Design is extremely useful for a researcher who intends to remove from an analysis the effect of a factor that is not of interest but is known to have a significant bearing on the results. If the researcher is not careful to organize the experiment so that the effect of this factor is separated from the other effects, that effect may be confounded with the effect in which the researcher is interested.
The major advantages of the Latin Square Experimental Design over other such designs, together with its main drawback, are:
1. The two-way stratification of the Latin Square Design leads to better control of variation than the Completely Randomized Design or the Randomised Block Design.
2. The two-way stratification leads to the elimination of variation, which often results in a small error mean square.
3. By and large the analysis is still simple, although it may be slightly more complex than the analysis for the Randomised Block Design.
4. The analysis remains relatively simple with the Latin Square Design even if some of the data are missing. Procedures are available to analyse a Latin Square in cases where one wishes to omit one or more treatments, rows or columns.
5. A major drawback of the Latin Square Design is that the number of treatments must be equal to the number of rows and the number of columns. Also, when the number of treatments is more than seven the Latin Square Design is hardly ever utilized, owing to the complexity arising from the number of permutations and combinations.
MAJOR ASSUMPTIONS IN ANALYSIS OF LATIN SQUARE
The Latin Square Experimental Design assumes that interactions between treatments and the row/column groupings are nonexistent. Since each treatment appears only once in each row and column, interactions, if present, can cause apparently significant differences between treatments. Therefore, it is imperative to choose the rows and columns of a particular Latin Square Design in a random manner. Interactions, if present, can then be considered as random elements that form part of the error in the treatments. This may at times magnify the error variance and thereby make the design less efficient. However, randomization of treatments may still lead to a valid theoretical design.
STEPS IN CONSTRUCTION OF LATIN SQUARE DESIGN
The analysis of a Latin Square Design is carried out in the following steps:
1. The correction factor is computed by squaring the grand total and then dividing by the number of observations.
2. The total sum of squares (TSS) is computed by adding the squares of the individual data and subtracting the correction factor from the sum.
3. The row sum of squares (SSR) is computed by adding the squares of the row sums, dividing by the number of data in a row and finally subtracting the correction factor. The column sum of squares (SSC) is computed in the same way from the column sums.
4. The treatment sum of squares (SST) is computed by adding the squares of the treatment sums, dividing by the number of data for each treatment and finally subtracting the correction factor.
5. The residual sum of squares (SSE) is computed by subtracting the row, column and treatment sums of squares (steps 3 and 4 above) from the total sum of squares (step 2).
6. These sums of squares are then entered in an analysis of variance table and the mean squares are computed as shown below.
7. The last step is to calculate F by comparing the Treatment Mean Square (MST) with the Residual Mean Square (MSE).
ANALYSIS OF VARIANCE TABLE
Source of variance    S.S.    Degrees of freedom    Mean square
Rows                  SSR     (n - 1)               MSR = SSR/(n - 1)
Columns               SSC     (n - 1)               MSC = SSC/(n - 1)
Treatments            SST     (n - 1)               MST = SST/(n - 1)
Residual or error     SSE     (n - 1)(n - 2)        MSE = SSE/[(n - 1)(n - 2)]
Total                 TSS     (n^2 - 1)
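The computation behind the analysis of variance table for a Latin Square can be sketched in Python as follows. The function name `latin_square_anova` is an illustrative assumption, and the response figures are hypothetical numbers invented purely to show the mechanics; only the 4x4 layout is taken from the text.

```python
def latin_square_anova(layout, response):
    """Analysis of variance for an n x n Latin Square.

    layout[i][j] is the treatment letter in row i, column j;
    response[i][j] is the observation recorded for that slot.
    """
    n = len(layout)
    grand_total = sum(sum(row) for row in response)
    cf = grand_total ** 2 / (n * n)  # correction factor

    tss = sum(x ** 2 for row in response for x in row) - cf
    ssr = sum(sum(row) ** 2 for row in response) / n - cf
    ssc = sum(sum(response[i][j] for i in range(n)) ** 2
              for j in range(n)) / n - cf

    treatment_totals = {}
    for i in range(n):
        for j in range(n):
            letter = layout[i][j]
            treatment_totals[letter] = treatment_totals.get(letter, 0) + response[i][j]
    sst = sum(t ** 2 for t in treatment_totals.values()) / n - cf

    sse = tss - ssr - ssc - sst
    mst = sst / (n - 1)
    mse = sse / ((n - 1) * (n - 2))
    f_ratio = mst / mse if mse > 0 else float("inf")
    return {"TSS": tss, "SSR": ssr, "SSC": ssc, "SST": sst,
            "SSE": sse, "MST": mst, "MSE": mse, "F_treatments": f_ratio}


layout = ["ABCD", "BADC", "CDAB", "DCBA"]  # 4x4 square shown earlier in the text
response = [                                # hypothetical observations
    [12, 15, 14, 18],
    [16, 11, 19, 13],
    [15, 17, 10, 14],
    [19, 13, 15, 12],
]
for name, value in latin_square_anova(layout, response).items():
    print(f"{name}: {value:.3f}")
```

The F ratio for treatments is compared with the tabulated value for (n - 1) and (n - 1)(n - 2) degrees of freedom.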
RANDOMISED BLOCKS VIS-À-VIS LATIN SQUARE
The Randomised Block Design is considered better than the Latin Square Design in several ways: it is available for a wider range of treatments, with no restriction on the number of replications. In the case of the randomized block design the analysis of variance is also more flexible. If data are lost in some blocks, those blocks can simply be omitted without complicating the analysis, whereas in such a situation a Latin Square experiment necessitates a much more complex analysis.
EXTENSION TO LATIN CUBES
The idea underlying a Latin Square can also be extended to three dimensions. In such cases, a Latin Square Design becomes a Latin Cube. However, in practice applications of Latin Cubes are few, since rather complex analysis is involved.
FACTORIAL EXPERIMENT
In recent times the Factorial Design has proved to be one of the most useful developments for improving the rational foundation of scientific experimentation. Factorial Experiments allow the researcher to evaluate the combined effect of two or more variables when used simultaneously. The information obtained from Factorial Experiments is considered more complete than that obtained from a set of single-factor experiments, because Factorial Experiments allow the evaluation of interaction effects. An interaction effect is an effect attributable to the combination of two or more variables over and above that which can be predicted from the variables considered alone.
The major reasons for including several factors in one experiment are:
a. To understand the overall effect of the factors economically by conducting one single experiment of moderate size.
b. To enlarge the basis of inference on a single factor by testing it under graded conditions of the other factors.
c. To find out the manner in which the effects of the factors interact with one another. These may not be entirely independent, but the emphasis can be made to vary with the degree of experimentation.
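The idea of an interaction effect can be illustrated with a minimal Python sketch for two factors, each at two levels. The function name, the labelling of the cell means and the figures themselves are hypothetical and serve only as an illustration. The interaction is measured as half the difference between the effect of factor A at the high level of factor B and its effect at the low level of B.

```python
def two_by_two_effects(y00, y10, y01, y11):
    """Main effects and interaction in a 2 x 2 factorial experiment.

    y_ab is the mean response with factor A at level a and
    factor B at level b (0 = low, 1 = high).
    """
    effect_a = ((y10 - y00) + (y11 - y01)) / 2     # average effect of A
    effect_b = ((y01 - y00) + (y11 - y10)) / 2     # average effect of B
    interaction = ((y11 - y01) - (y10 - y00)) / 2  # extra effect of A when B is high
    return effect_a, effect_b, interaction


# Hypothetical mean responses for the four treatment combinations.
effect_a, effect_b, interaction = two_by_two_effects(y00=20, y10=30, y01=25, y11=45)
print("Effect of A:", effect_a)
print("Effect of B:", effect_b)
print("Interaction AB:", interaction)
```

If the two factors acted independently, the interaction term would be zero; a non-zero value is exactly the information that a set of single-factor experiments cannot provide.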
Activity 5.11
Write a note on the problem of consciousness.
(http://en.wikipedia.org/wiki/HardProblemofConsciousness)
DIFFICULTIES IN EXPERIMENTAL DESIGNS
The following problems are generally encountered in experimental designs:
1. Problems in experiment setting: Generally it is not easy to determine the conditions under which experiments should be set up. In the case of scientific experiments laboratory conditions may be established, but this may not be possible in the case of sociological or psychological experiments. When an experiment is conducted in a natural setting, problems arise in varying the treatments and other conditions.
2. Problems in getting cooperation: In business and social research it is not easy to obtain cooperation from the people who form the subject of experimentation. Human subjects at times act according to their free will. A lack of interest also at times makes cooperation impossible.
3. Difficulties in establishing control: Control in an experimental situation is at times lost, more so in complex business and socioeconomic research, since it is very difficult to obtain complete knowledge of the various factors that influence the experiment. Even the known factors are at times difficult to control.
4. Problems of consciousness: In business experimental design the experimental subject is rather fluid and possesses a consciousness which limits the degree of experimentation.
SUMMARY
In this chapter the definitions of experimental design, factors, levels, treatments, experimental units, response, effect, interaction, randomization, replication and experimental error have been explained. The need for experimental designs has been discussed. The basic factors to be evaluated in an experimental design have been described. The various steps in conducting an experimental design have been highlighted. Various types of experimental designs, viz. the completely randomized design, the randomized block design, the Latin Square design and factorial experiments, have been explained.
REVIEW EXERCISES
1. Define the term experiment and experimental design.
What is the need of experimental design?
2. Explain the basic principles of experimental designs
3. Describe some of the important research designs used in
experimental hypothesis testing research study.
FURTHER READINGS
• Ryan, Thomas P. (2007). Modern Experimental Design. Malden, MA: Wiley-Interscience.
• Dean, Angela M. & Voss, Daniel (2000). Design and Analysis of Experiments. New York, NY: Springer Publishing Company.
• Cochran, W.G. & Cox, G.M. (1992). Experimental Designs. New York City, NY: Wiley.
BLOCK 2
DATA COLLECTION AND MEASUREMENT
CHAPTER 6 METHODS AND TECHNIQUES OF DATA
COLLECTION
CHAPTER 7 SAMPLING AND SAMPLING DESIGN
CHAPTER 8 ATTITUDE MEASUREMENT AND SCALES
CHAPTER 6 METHODS AND TECHNIQUES
OF DATA COLLECTION
Objective
After reading this Chapter the learner will be able to:
• Understand the concept of data and its characteristics
• Know about and distinguish between primary and secondary data
• Appreciate the sources of secondary data and its verification
• Understand and apply various methods of data collection
Structure
• Data Defined
• Characteristics of Data
• Primary Data
• Secondary Data
• Distinction Between Primary and Secondary Data
• Verification of Secondary Data
• Characteristics of Secondary Data
• Sources of Secondary Data
• Methods of Data Collection
• Summary
• Review Questions
• Further Readings
DATA DEFINED
Data in the plural sense implies a set of numerical figures usually obtained by measurement or counting. Data refers to a numerical description of the quantitative aspects of things. For example, data on the students of a college include a count of the number of students and separate counts of the various types of students, such as male and female, married and unmarried, or undergraduates and postgraduates. They may also include such measures as their heights and weights.
CHARACTERISTICS OF DATA
For a numerical description to be called data, it must possess the following characteristics.
i) Data is an aggregate of facts: For example, single unconnected figures cannot be used to study the characteristics of a business activity.
ii) Data is affected to a large extent by a multiplicity of factors: For example, in a business environment the observations recorded are affected by a number of factors (controllable and uncontrollable).
iii) Data is estimated according to a reasonable standard of accuracy: For example, in the measurement of length one may measure correct up to 0.01 cm; the quality of a product is estimated by certain tests on small samples drawn from big lots of products.
iv) Data is collected in a systematic manner for a predetermined objective: Facts collected in a haphazard manner and without a complete awareness of the objective will be confusing and cannot be made the basis of valid conclusions. For example, collected data on prices serve no purpose unless one knows whether data on wholesale or retail prices are wanted and which commodities are under consideration.
v) The data must be related to one another: The data collected should be comparable; otherwise they cannot be placed in relation to each other. For example, data on the yield of a crop and the quality of the soil are related, but crop yields cannot have any relation with data on the health of the people.
vi) Data must be numerically expressed: That is, any facts to be called data must be numerically or quantitatively expressed. Qualitative characteristics such as beauty, intelligence etc. are called attributes, and must be scaled to be expressed in numeric terms.
Activity 6.1
Comment on whether data collected on room temperature in an office and on the efficiency of workers are related or not.
Comment on whether data expressing emotional stability in terms of EQ (Emotional Quotient) are numerically expressed.
PRIMARY DATA
These data are collected for the first time as original data. The data are recorded as observed or encountered. Essentially they are raw material. They may be combined or totalled, but they have not been statistically processed. For example, data obtained in a population census by the Office of the Registrar General and Census Commissioner, Ministry of Home Affairs, are primary data.
ADVANTAGES
The advantages of primary data are:
i) Primary data are more accurate and give detailed information according to the requirement.
ii) The explanations of terms, definitions and concepts are incorporated in primary data.
iii) Methods of collection, their limitations and other allied aspects are highlighted.
iv) They are more reliable and less prone to errors.
v) They often include a copy of the schedule and a description of the procedure used in selecting the sample and collecting the data.
LIMITATIONS
i) It is expensive to collect primary data.
ii) It is a time-consuming method of data collection.
iii) It requires experts or trained personnel to collect the primary data; otherwise it may lead to wrong observations and unreliable data collection.
Activity 6.2
Suppose you were to ascertain the consumption of water in a particular locality. Would you collect primary data to do so? If so, what would be the advantages and disadvantages?
Is there any other way you could come to a conclusion? What could it be?
SECONDARY DATA
These are also known as published data. Data which are not originally collected but rather obtained from published sources and statistically processed are known as secondary data. Examples are data published by the Reserve Bank of India, the Ministry of Economic Affairs and the Commerce Ministry, as well as by international bodies such as the World Bank, the Asian Development Bank, the International Labour Organization etc.
ADVANTAGES
i) They are less costly, as the data are already available.
ii) They are faster to collect and process as compared to primary data.
iii) They provide valuable insights and contextual familiarity with the subject matter.
iv) They provide a base on which further information can be collected to update them and finally use them for the purpose of research.
DISADVANTAGES
i) Locating an appropriate source and finally getting access to the data could be time consuming.
ii) The data available might be too vast, and a lot of time may be spent going through them.
iii) They might have been originally collected for some purpose which is specific and not known to the present researcher. To that extent, it might be erroneous to use them.
iv) The accuracy of secondary data, as well as their reliability, depends on their source.
v) They might not be up to date and so not of much use in a dynamically changing environment.
Activity 6.3
A mall is opening in a particular locality. To determine what kind of stores it should have, you need socioeconomic data on the population living near the mall. Would you use secondary data and, if yes, why?
What could be some disadvantages of doing so?
DISTINCTION BETWEEN PRIMARY AND SECONDARY DATA
Description | Primary Data | Secondary Data
1. Source of collection of data | Primary source, such as observation and questionnaire methods | Secondary source: published data of government agencies, trade journals etc.
2. Statistical process | Statistical processing is not done | Statistical processing is done
3. Originality of data | Original; collected for the first time by the user | Not original; the data are collected by some other agencies
4. Use of data | The data are compiled for a specific purpose | The data are taken from some other source and used for decision making
5. Terms and definitions of units | Terms and units are incorporated | Terms and units are not incorporated
6. Copy of the schedule and description of the procedure of data collection and sample selection | Included | Excluded
However, on closer investigation it will be noticed that the distinction between primary and secondary data is in many cases one of degree only. Data which are secondary in the hands of one person may be primary for others. For example, to a bank the details of its customers are primary data; to a reader of the bank's report they are secondary.
Activity 6.4
Comment on whether the compilation of marks obtained by students in a term examination is primary data for the class teacher and secondary data for the parents.
VERIFICATION OF SECONDARY DATA
Secondary data are readily available and their acquisition is not problematic. Yet their use is not easy unless the purpose of the investigation is the same as that of the investigation for which the data were originally collected. This is rarely true. It is therefore necessary that secondary data be used with due precaution, and for this reason they should be edited and verified before use. Inconsistencies might be there. Probable errors cannot be discounted. Omissions may be frequent. There is a possibility of inaccuracy in the original data collection. The data may be inadequate, unreliable or unsuitable. Therefore they should be scrutinized and edited before use. It is never safe to take published data at their face value without knowing their meaning and limitations.
CHARACTERISTICS OF SECONDARY DATA
Secondary data should possess the following characteristics:
a) Reliability of data: The reliability of data can be established from the following information:
i) who collected the data and from which sources;
ii) the method used in collecting the data;
iii) whether the census or the sampling method was used in collecting the data;
iv) whether the compiler and the source are both dependable;
v) the purpose for which the data were originally collected.
b) Suitability of data: Secondary data should be suitable for the enquiry. Even if the data are reliable, they should not be used if they are found to be unsuitable for the enquiry. Data may be suitable for one enquiry and not for another. To check the suitability of data, one should consider the following:
i) What is the object of the enquiry?
ii) Have the definitions of the various terms and units of collection been carefully scrutinized?
iii) What is the standard of accuracy aimed at?
iv) What is the age of the data, i.e. in what time frame were the data collected?
v) Do the data refer to a homogeneous population?
c) Adequacy of the data: Secondary data may be reliable and suitable but still inadequate for the purpose of the current investigation. The data collected earlier may refer to an area which is narrower or wider than the area required for the present enquiry, in which case the data should not be used at all. The secondary data may not cover the period suitable to the enquiry. The degree of accuracy of the original data may be found to be inadequate for the enquiry.
SOURCES OF SECONDARY DATA
Following are the main sources of secondary data.
i) Official Publications: Publications of the Central
and State Governments, Government of foreign
countries or international bodies, etc.
ii) Semi Official Publications: Publication of the
Semi Government bodies, e.g., Municipal/District
Board, Corporation etc.
iii) Publication relating to Trade: Publication of the
trade associations, chamber of commerce, banks,
cooperative societies, stock exchange, trade unions
etc.
iv) Journal/Newspapers etc. : Some newspapers/
journals collect and publish their own data, e.g.,
Indian Journal of Economics, Economist, Economic
Times, Far Eastern Review etc.
v) Data Collected by Research Agencies: Research
agencies like MARG Nielsen and Gallys also collect
useful data which are available as data bases upon
payment.
vi) Unpublished Data: Data may be obtained from several companies, organizations, universities etc. which work in the same area and have done very good work. For example, data on energy conservation from The Energy and Resources Institute (TERI) can be utilized by private and public sector companies active in this area.
METHODS OF DATA COLLECTION
The following methods of data collections are generally used.
i) Observation method
ii) Personal Interview
iii) Forms, schedules or questionnaire method.
iv) Documented sources of data
v) Case study method
OBSERVATION METHOD
This is the most commonly used method of data collection, especially in studies relating to the behavioural sciences. Accurate watching and noting of phenomena as they occur in nature, with regard to cause and effect or mutual relations, is called the observation method of data collection.
CHARACTERISTICS OF OBSERVATION METHOD
These are as follows:
i) Direct Method: In the observation method data are collected through direct contact with the phenomenon under study. In this method the sensory organs, particularly the eyes and ears, and the voice are used.
ii) Source of Primary Data: It is a classical method for the collection of primary data.
iii) Requires In-depth Study: In this method the observer goes to the field and studies the phenomenon in an in-depth fashion to acquire data.
iv) Collection follows observation: In this method the investigator first of all observes the things and then collects the data.
v) Relationship between cause and effect: The observation method leads to the development of a relationship between the cause and effect of events.
vi) Scientific method for collecting dependable data: This is the most scientific method for the collection of dependable data. Observations are planned and recorded systematically. There should be checks and balances on this methodology.
vii) Selective and Purposeful collection: The observations are made with a definite purpose. Collection of materials is done according to a particular purpose.
MERITS OF OBSERVATION METHOD
These are as follows
a) Common method: The method of observation is common to all the disciplines of research.
b) Simplicity: The method is very simple to use.
c) Realistic: Since observation is based on actual and first-hand experience, its data are more realistic than the data of techniques which rely on indirect and secondary sources of information.
d) Formulation of hypothesis: In all business operations the method of observation is used as the basis for formulating hypotheses regarding the business research problem.
e) Verification: For verification of a hypothesis we again depend upon observation. Therefore it can be said that the problem presents itself and resolves itself through the observation method.
f) Greater reliability of conclusions: The conclusions of observation are more reliable than non-observational conclusions, because they are based on first-hand perception by the eyes and can be verified by anyone by visual perception.
On account of the above advantages, the observation method is called the Classical Technique of Investigation for research purposes.
Activity 6.5
List a few research pursuits where the observation method of data collection would be useful.
LIMITATIONS OF OBSERVATIONS METHOD
i) Some events cannot be objects of observation: There are certain events which are microscopic or indefinite, which may not occupy any definite space or occur at a definite time, and which cannot be noticed for observation purposes. For example, it is not possible to observe emotions and sentimental factors, likes and dislikes etc.
ii) Illusory observation: Since we have to depend upon our eyes for observation, we can never be sure whether what we are observing is the same as it appears to our eyes. Eyes are prone to deception. It is well known that eyes see a mirage in the desert at noon.
iii) Self-consciousness in the observed: In the observation method the atmosphere tends to become artificial, and this leads to a sense of self-consciousness among the individuals who are being observed. This hampers the naturalness of their behaviour, and thus the purpose of observation, which is to know the behaviour of individuals under normal conditions, is defeated.
iv) Subjective explanation: The final results of observation depend upon the interpretation and understanding of the observer, so the defects of subjectivity creep into the description of what is observed and the deductions from it. For example, if we see a man coming out of a wine shop, quite drunk, and he starts firing at random, we may believe that liquor induces irrational violence in a man, which may not always be the case.
v) Slowness of investigation: The slowness of observation methods leads to disheartenment and disinterest among both observer and observed.
vi) Expensive methodology: Being a long-drawn process, the technique of observation is expensive.
vii) Inadequacy: The full answer cannot be obtained by observation alone; observation must be supplemented by other methods of study.
INTERVIEW METHOD
Under this method of collecting data there is face-to-face contact with the persons from whom the information is to be obtained (known as informants). The interviewer asks them questions pertaining to the survey and collects the desired information. Thus if a person wants to collect data about the working conditions of the workers of Hindustan Unilever Ltd., Mumbai, he would go to the works at Mumbai, contact the workers and obtain the information. The information is obtained at first hand and is original in character.
CHARACTERISTICS OF INTERVIEW METHOD
The following are the main characteristics of interview method.
a) It is a close contact or interaction including
dialogue between two or more persons.
b) There is a definite object of interview, such as
knowing the ideas and views of others.
c) There is a face to face contact or primary
relationship between the individuals.
d) This is the most suitable method of data collection
for business and economic problems.
MERITS OF INTERVIEW METHOD
As compared with other research methods, the method of interview possesses some unique qualities. These are:
1. Direct research: In the interview method the researcher does not have to confine himself to the external aspects of human behaviour, but can probe into the internal aspects as well. Furthermore, in the interview the barrier between the researcher and the respondent is eliminated and they know each other directly.
2. In-depth research: This characteristic follows from the above-mentioned fact that the interview studies the internal aspects of the research problem, which are inaccessible to other methods. Accordingly, in comparison with other methods, the interview method is a method of in-depth research.
3. Knowledge of past and future: In an interview we also learn about the outlook, aspirations and future goals of human beings and their present abilities. Accordingly, by interview we unravel the hidden past and prognosticate about the future.
4. Mutual encouragement: In an interview there is an inflow and outflow of ideas between the interviewer and the interviewee. This exchange proves encouraging for both the researcher and the respondent.
5. Supra-observational: An interview gives us knowledge of facts which are inaccessible to observation. The emotional attitudes, secret motivations and incentives governing human life come to the surface in an interview though they are unobservable. Therefore the interview has a quality which may be called supra-observational.
6. Examination of known data: The information given by the interviewee, if suspect, can be tested through cross-examination of the interviewee. Moreover, the body language accompanying the responses gives the interviewer a clue about the veracity or otherwise of the answers being given.
Activity 6.6
In the research which looks into customer preference interview
method is often method. Comment whether









LIMITATIONS OF INTERVIEW METHOD
In spite of the above-mentioned merits of the method of
interview, it suffers from certain limitations, which are:
1. Inadequate information: There are certain matters
which can be written in privacy but about which one
does not speak before others. If these matters are the
subject of an interview, the likelihood is that only a
disguised version of these will be presented. Again,
there are people who are temperamentally unable to
discuss things though they are powerful writers. These
persons are also unlikely to present true facts in an
interview.
2. Defects due to interviewee: If an interviewee is of
low intelligence he is unfit to give correct
information. Some persons are in the habit of talking
in a roundabout manner and it is impossible to
decipher what they say.
3. Prejudices of Interviewer: The prejudices of
interviewers are as much a problem of research as are
the inadequacies of the interviewees. If the
interviewer is unable to suppress his prejudices, his
understanding and interpretation of the information
given in the interview will be defective.
4. One-sided and incomplete research: In the
interview, certain aspects of human behaviour get
overemphasized at the expense of others. There is a
tendency to give too much importance to personal
factors and to minimize the role of environmental
factors. For these reasons, research by interview is
liable to suffer from one-sidedness.
5. Interviewing is an art rather than a science:
Another limitation of the interview method is that its
procedures cannot be standardized; there is too much
room for improvisation. The success of an interview is
due more to skill and tact than to knowledge.
Indeed, success in an interview depends almost entirely
on the intelligence and skill of the interviewer.
Therefore, the method of interview is more an art than
a science.
6. Difficulty in Persuading the Interviewee: Many
people are unwilling to participate in interviews. Under
these circumstances, the first problem before the
interviewer is to persuade the prospective interviewee
to extend his cooperation to the research project and
agree to be interviewed.
TYPES OF INTERVIEWS
The chief types of interviews are as follows:
1. Classification according to formalness: The formal
classification of interviews gives us two main types.
a) Formal Interview: In this type of interview, the
interviewer presents a set of well-defined questions
and notes down the answers in accordance
with prescribed rules.
b) Informal Interview: In contrast with the formal
interview, in an informal interview the interviewer has
full freedom to make suitable alterations in the
questions to suit a particular situation. He may revise,
reorder or paraphrase the questions to suit the needs of
the respondents.
2. Classification according to number: Another
classification of interviews is according to the number of
persons taking part. Following are its main types:
i) Personal Interview: In a personal interview a single
individual is interviewed. The personal interview helps
to establish close personal contact between the
interviewer and the interviewee and, as a result,
detailed knowledge about intimate and personal
aspects of the individual can be had.
ii) Group Interview: As the name makes plain, the
group interview is the opposite of the personal,
because in it two or more persons are interviewed.
This interview is suited for gathering routine
information.
3. Classification according to purpose: Interviews have
also been classified by the purpose for which they are
held. Following are the types in this classification:
i) Diagnostic Interview: As the name makes
clear, this type of interview tries to
understand the cause or causes of a malady. In
clinical psychology and psychoanalysis, the
preliminary interviews with the patients are held
with the purpose of grasping the nature and cause
of the disease.
ii) Research interview: These interviews are held
to gather information pertaining to certain
problems. The questions to be asked to gather
the desired information are predetermined, and
by asking them of the informants the data is
collected. Inasmuch as this data is gathered for
the purpose of research into a problem, these
are called Research Interviews.
iii) Interviews to fulfil curiosity: These
interviews, as the name implies, are held to
satisfy some question lurking in the mind of a
scientist. For example, if a scientist gets an idea
that good lectures are delivered extempore, he
has to ask some reputedly good lecturers
whether or not they make extensive notes for
delivering a lecture.
4. Classification according to the period of contact:
Different types of problem require different amounts of
time for contact with respondents. The time can be short
or long. Accordingly, the two types according to time are
as follows:
i. Short-contact interview: For filling up
schedules etc., a single sitting of short duration
suffices. Therefore, in researches of this type a
short-contact interview suffices.
ii. Prolonged-contact interview: In contrast with
research by schedule, the case-history method
requires prolonged interviews. In these, the
establishment of close personal relations between
the interviewer and interviewee is very likely.
5. Classification according to subject matter: The
classification of interviews according to subject matter
gives us the following three types:
i. Qualitative interview: Qualitative interviews
are about complex and non-quantifiable subject
matter. For example, interviews held for case
studies are qualitative, because the interviewer has
to range over past, present and future to know
enough about a case.
ii. Quantitative interview: Quantitative
interviews are those in which certain set facts are
gathered about a large number of persons.
Census interviews are an example.
iii. Mixed interview: In certain interviews both types
of data, the routine and the specialized, are sought;
part of it is quantifiable while the rest is not.
Therefore it is known as a mixed interview.
MEANS OF ELICITING CORRECT RESPONSES IN AN INTERVIEW
The main concern of the researcher employing the method of
interview is to get correct and to-the-point answers on the topic
of research. Research can be economical
only if the entire concentration of everybody involved in it is on
the topic and if deviation from the main line of approach is
checked. Normally the veracity of responses depends upon the
skill and tactful approach of the interviewer, and no rules can be
framed in this connection. Still, the following points should be
kept in mind:
1. Narrative Style: To allow the interviewee maximum
opportunity for self-expression, he should be
allowed to tell his experience in story form. That
is, he may narrate his experiences in a connected and
sequential form.
2. Freedom of Description: The interviewee should
be allowed to describe whatever he thinks
worthwhile. Even if some irrelevant facts are being
described, the interviewee need not be checked. As
long as he remains within reasonable limits of the
main topic in question, he should not be pulled up.
3. Alert Direction: Though maximum freedom of
self-expression is desirable, this can only be within the
scope of the problem being discussed. This requires
alertness and direction at suitable occasions.
4. No harshness in direction: Under no
circumstances can an interviewer afford to offend the
interviewee. Good humour is the essence of
successful direction.
5. Interested hearing: The interviewer must hear the
interviewee with full interest. Nobody should be able
to guess from his expressions that he is bored or that
his mind is elsewhere.
6. Appreciation of interviewees: The interviewer should
convince the interviewee that he appreciates his
cooperation and greatly values the information given
by him. Such a word of encouragement has a salutary
effect on the interviewee, who then gives correct
responses.
7. Objective attitude: The interviewer's comments, wherever
necessary, must not smack of his personal views, likes
and dislikes. He must remain
objective and should not let his personal feelings
interfere. He must at all costs avoid criticism of
the interviewee's behaviour.
8. Examination of known data: The information
given by the interviewee, if suspect, can be tested
through cross-examination of the interviewee.
Moreover, the emotional expressions accompanying
the response give a clue to the interviewer about the
veracity or otherwise of the answer being given.
QUESTIONNAIRE METHOD
Under this method, a list of questions pertaining to the survey
(known as a questionnaire) is prepared and sent to the various
informants by post. The questionnaire contains the questions and
provides space for the answers. A request is made to the
respondents through a covering letter to fill in the questionnaire
and send it back within a specified time.
TYPES OF QUESTIONNAIRE
The questionnaire may be of the following types:
a) Structured Questionnaire: Structured questionnaires
are those in which each question is presented to the
respondents with fixed response categories.
b) Unstructured questionnaire: Here every question is
not necessarily presented to the respondent in the same
wording and does not have fixed responses. Respondents
are free to answer the question the way they like.
c) Mixed Questionnaire: This is a questionnaire which is
neither completely structured nor completely unstructured.
It consists of both types of questions.
d) Disguised type questionnaire: Here the questions are
indirect, so that respondents will answer certain
sensitive questions to which a direct question might
draw answers of doubtful accuracy.
PROCEDURE OF ORGANIZATION OF RESEARCH THROUGH
QUESTIONNAIRE
The questionnaire method is frequently used in gathering
data. It is used to collect data from a large, diverse and widely
scattered group of people. The following steps are taken in the
organization of research through the questionnaire method.
i. Framing the questionnaire scientifically to meet the data
requirements.
ii. Compiling the names and addresses of the respondents.
iii. Pretesting the questionnaire to judge its suitability
and utility.
iv. Despatching the questionnaire to all the respondents.
v. Giving a code or serial number to each case.
vi. Recording the receipt of questionnaires from the
respondents date-wise.
vii. Following up properly to secure an adequate response.
viii. If an adequate response is not received even after the
third reminder, replacing those respondents with new names
and following the above procedure again.
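The record keeping implied by steps iv to viii can be sketched in a few lines of Python. This is only an illustrative sketch, not a prescribed procedure: the names Case, send_reminder and response_rate, the sample respondents and all dates are hypothetical, and the limit of three reminders is taken from step viii.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional

MAX_REMINDERS = 3  # step viii: respondents are replaced after the third reminder


@dataclass
class Case:
    code: str                                 # serial number given to each case (step v)
    name: str
    address: str
    dispatched_on: date                       # step iv
    received_on: Optional[date] = None        # step vi, recorded date-wise
    reminders: List[date] = field(default_factory=list)  # step vii

    def needs_replacement(self) -> bool:
        """True when no reply has come even after the third reminder."""
        return self.received_on is None and len(self.reminders) >= MAX_REMINDERS


def send_reminder(case: Case, when: date) -> bool:
    """Log a reminder; return False if the case has replied or is exhausted."""
    if case.received_on is not None or len(case.reminders) >= MAX_REMINDERS:
        return False
    case.reminders.append(when)
    return True


def response_rate(cases: List[Case]) -> float:
    """Share of despatched questionnaires that have come back."""
    if not cases:
        return 0.0
    returned = sum(1 for c in cases if c.received_on is not None)
    return returned / len(cases)


# Hypothetical register with two respondents.
register = [
    Case("001", "Respondent A", "Address A", date(2024, 1, 2)),
    Case("002", "Respondent B", "Address B", date(2024, 1, 2)),
]
register[0].received_on = date(2024, 1, 15)
for when in (date(2024, 1, 20), date(2024, 2, 3), date(2024, 2, 17)):
    send_reminder(register[1], when)
```

In this register the first respondent has replied, while the second has had three reminders without reply and would be replaced by a new name under step viii.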
EXHIBIT 6A
Types of Questions Encountered in Devising A Questionnaire
1. Dichotomous questions: When the reply to a question takes the form of one
of two given alternatives, one negative and the other positive, it is called
a dichotomous question. The negative and positive answers together cover
the whole range of possible answers. Below is an example of this type of
question:
Whether respondent is educated. Yes/No
2. Multiple choice questions: These types of questions are also known as
cafeteria questions. In these questions, a large number of alternative
answers are given. These alternatives are quite comprehensive and the
respondent has to select one of them.
In framing these types of questions, the framer has to take care
that all the alternatives are included and that they are mutually exclusive.
3. Leading Questions: These are suggestive questions. They are also
known as attractive questions. In these types of questions, the reply is
suggested in a particular direction. The reply to these questions is not
necessarily Yes or No.
Leading questions should, as far as possible, be avoided. If they are
not avoided, they can themselves suggest an answer.
4. Ambiguous questions: Questions that lack clarity and are so
worded that the meaning is not clear are called ambiguous. Answers to
such questions cannot be secured in a clear-cut manner, and more than one
reply is possible.
Such questions should, as far as possible, not be included in
a questionnaire. They are likely to confuse the process of collection of data.
5. Ranking item questions: Multiple choice questions contain a number of
alternatives in the form of replies, and ranking type questions are so
designed as to record the preferences of the respondent. In multiple choice
there is one answer, but in ranking item questions there may be several
preferences arranged item-wise. The respondent may indicate several
preferences.
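The difference between dichotomous, multiple choice and ranking item questions can be made concrete with a small Python sketch. It is an illustration under assumptions, not part of the exhibit: the Question class, the is_valid_answer function and the fuel and brand items are hypothetical, and only the "educated" item comes from the text.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Question:
    text: str
    kind: str                 # "dichotomous", "multiple_choice" or "ranking"
    options: Sequence[str]    # the fixed alternatives offered to the respondent


def is_valid_answer(q: Question, answer: List[str]) -> bool:
    """Check a reply against the rule for each question type."""
    if any(a not in q.options for a in answer):
        return False                          # reply outside the given alternatives
    if len(set(answer)) != len(answer):
        return False                          # the same alternative given twice
    if q.kind == "dichotomous":
        return len(q.options) == 2 and len(answer) == 1
    if q.kind == "multiple_choice":
        return len(answer) == 1               # exactly one alternative is selected
    if q.kind == "ranking":
        return len(answer) >= 1               # ordered preferences, most preferred first
    return False


educated = Question("Whether respondent is educated", "dichotomous", ("Yes", "No"))
fuel = Question("Main cooking fuel used (hypothetical item)", "multiple_choice",
                ("Gas", "Electricity", "Wood", "Other"))
brands = Question("Rank the brands you prefer (hypothetical item)", "ranking",
                  ("Brand A", "Brand B", "Brand C"))
```

Here a reply of ["Yes", "No"] to the dichotomous item is rejected, while ["Brand B", "Brand A"] is an acceptable reply to the ranking item because several preferences may be indicated in order.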
Activity 6.7
Why is there a very low response rate to questionnaires sent out?
What can be done to improve the response rate?









ADVANTAGES OF QUESTIONNAIRE METHOD
This method is an indirect method of data collection. It has
certain advantages as compared to other methods. Its merits
are as follows:
i) Economical: In comparison with other methods of data
collection (observation, case study, interview
etc.) the mailed questionnaire method is the cheapest and
quickest. The cost in this method is only that of
getting the questionnaire prepared and the postage
expense. There is no need to visit the respondents
personally or continue the study over a long period.
ii) Less skill of administration: The questionnaire method
requires less skill to administer than the interview,
observation or case study method of data collection.
iii) Research in wide area: If the informants or the
respondents are scattered over a large geographical area,
the questionnaire method is the only means of research. The
other methods of data collection, such as the schedule,
interview or observation method, do not prove
successful. Even after spending a large amount of money, it
may not be possible to collect the information quickly, but
through the questionnaire method a large area can be covered.
Sometimes certain agencies also cooperate in the task
of despatching the questionnaire to the
informants.
iv) Time Saving: Besides saving money, the questionnaire
method saves time. Hundreds of persons can be approached
through it simultaneously, whereas if they are to be
interviewed it may take a long time.
v) More reliable in special cases: This is a method of
collecting data in an objective manner through
standardized impersonal questions. The respondents give
free, frank and reliable information. Moreover, the
informants or respondents are free to give information as
and when they want. Because of this freedom, the
information that is provided is more dependable and
reliable.
vi) Free from external influence: In the questionnaire
method, informants or respondents are free from external
influences, as the researcher is not present. They provide
reliable, valid and meaningful information based on their
knowledge, views and attitudes.
vii) Suitable for special type of responses: The
information about certain problems can be best obtained
through this method. For example, information about
sexual habits, marital relations, dreams etc. can easily be
obtained by keeping the names of respondents
anonymous.
viii) Fewer errors: Chances of error are very low, because the
information is supplied by the respondent himself.
ix) Originality: The informants are directly involved in the
supply of information, so the method is more original.
x) Uniformity: The impersonal nature of the questionnaire
ensures uniformity from one measurement situation to
another.
xi) Collection of information relevant to the objective:
In this method the questionnaire is framed
according to the objective, hence the data collected also
accord with that objective.
DISADVANTAGES OF QUESTIONNAIRE METHOD
The method has the following disadvantages/limitations.
a) Lack of interest: Lack of interest on the part of
respondents is very common. The respondents lose
interest when there is a large number of questions.
b) Incomplete response: Some persons give answers
which are so brief that the full meaning is
incomprehensible.
c) Illegibility: Some persons write so badly that even they
themselves find it difficult to read their own handwriting.
d) Unsuitable for in-depth research problems: If a problem
requires deep and long study, it cannot be studied
through this method.
e) Inelastic: This method is very rigid since no alteration
may be introduced.
f) Prejudices and bias of the researcher influence the
questions: Since the researcher frames the questions, his
personal views and prejudices influence the
questions, and instead of being objective and
impersonal he becomes biased and prejudiced.
g) Poor response and lack of reliability: Not all the informants
give answers or fill in the questionnaire. There
is a large percentage of those who do not send back the
questionnaire. This makes the study unreliable.
h) The incompleteness of the form of questionnaire:
Sometimes the questionnaire is itself incomplete and
some of the important aspects about which
information is required are not covered, hence the data
collected is neither reliable nor helpful for the study.
i) Lack of personal contact: There is no provision in this
method for coming face to face with the respondent. This
may result in manipulation of replies by the respondents.
CONSTRUCTION OF QUESTIONNAIRE
The questionnaire should be developed in a scientific manner.
While designing the questionnaire, the language and the
wording of the questions should be kept interesting enough for
the respondents to give replies. It has to be kept in mind that in
the questionnaire method the respondent replies from a
distance. Therefore, the psychology of the respondents should be
kept in mind and the questionnaire should be framed to encourage
them to give correct answers.
STEPS IN CONSTRUCTION OF QUESTIONNAIRE
A questionnaire is framed with the help of certain background
material. It is also necessary to organize this material in a
proper manner. This procedure is known as construction of the
questionnaire. Following are the steps followed.
1. Determination of intellectual level of the
respondent: While designing the questionnaire the
intellectual level of the respondent has to be kept in view.
The questions should suit the intellectual level of those for
whom they are meant. If the questions, their language and
their wording are not in accordance with the intellectual
level of the respondents, then it would not be possible for
them to furnish correct replies. In such a situation the
purpose of the business research shall not be fulfilled.
2. Defining the depth of the questionnaire: The
questionnaire should enable in-depth research. This means
that the questions contained therein should cover all the
aspects of the research problem. The
research area should be perfectly understood and the
questions should be selected accordingly. If the area is not
properly defined the questionnaire will not be framed
properly. There is every likelihood of certain aspects
being left out.
3. Use of past experience in formulation of
questionnaire: In order to construct the questionnaire
properly, the researcher or the framer of the
questionnaire must take advantage of past
experience. The learnings from past experience enable
researchers to know about the shortcomings of similar
questionnaires. This enables the framer of a new
questionnaire to remove these deficiencies and so improve
the response rate.
4. Determination of utility of questionnaire: The final
step that the researcher or framer of the questionnaire
should take is to make the questionnaire utilitarian. This
means that he should frame questions which, while being
useful for the proposed research, are also of interest to
the respondents. In such a case the study shall be
complete and its objective shall be achieved.
The researcher or the framer of the questionnaire should take
certain precautions while constructing or framing the
questionnaire. For the sake of emphasis these precautions are
listed as follows:
1. Questions should be simple, unambiguous and
clear. The questions should not be ambiguous or couched in
difficult words and unknown phraseology. The questions
should be simple and suit the level of intelligence of the
informants. Very complicated questions should be avoided.
Unless this is done, the questionnaire is not likely to be useful.
2. Stimulating for the informants: The answers to the
questions are to be furnished by the respondents. If the
questions are not stimulating enough, the informants are not
likely to provide relevant answers and the whole purpose
shall be defeated.
3. Limited number of questions: If there are a large number of
questions, the respondents shall lose interest in them.
Generally the informants do not want to be bothered with too
many questions. If they feel that they are being subjected to
unnecessary work, they start giving unrelated and needless
answers.
4. Technical and special words should be clearly explained: If
the questionnaire contains certain technical and special
terminology, it should be clearly explained at the beginning of
the questionnaire.
5. Hypothetical questions should not be asked: While
formulating the questionnaire the framer should always keep
in mind not to include hypothetical questions. Such
questions may draw answers, but the investigator does not
gain anything from them. Subjective and qualitative questions
should be avoided as far as possible.
PRETESTING OF RESPONSE IN QUESTIONNAIRE
Pretesting and its Importance: The basic thing that has to be
kept in mind is that error should be avoided in collecting
data through a questionnaire. For this it is necessary that the
questionnaire should be tested before it is actually used as a
device in a business research study. Pretesting is nothing but
testing the validity of the questionnaire before it is actually
used. If testing is done in a limited group the following steps
should be taken.
1. Testing the validity in a representative sample: The
questionnaire should be tested in every respect before it is
actually mailed to the respondents. This testing can be done
on a limited number of people through the sampling method, but
while testing it within the sample it should not be forgotten
that the sample should be perfectly representative.
2. Pretesting until it suits the objectives: The
questionnaire should meet the objective of the research study.
This means that it should help in getting the maximum possible
data. It is, therefore, necessary that it should be made
suitable to the objectives of the study even if it requires
testing more than once. In general, the pretest should check
that the questions point towards the objectives of the research.
3. Poor response requires modification of the
questionnaire: The questionnaire is mailed to the
informants, who are required to fill it in and send it back. If
the response of the informants is poor and very few
questionnaires are returned, the questionnaire needs
modification, change and reframing. Furthermore, if the
questionnaires returned are incomplete or the replies are not
satisfactory and up to the mark, it should be presumed that the
questionnaire is defective and requires modification. After
modification the questionnaire should again be subjected to
pretesting.
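The two warning signs named in step 3, few questionnaires returned and incomplete replies, can be checked with a short Python sketch. This is a hedged illustration only: the function names, the sample replies, and the thresholds of 50 per cent return and 80 per cent completion are assumptions made for the example, not figures given in the text.

```python
from typing import Dict, List, Optional, Tuple

Reply = Dict[str, Optional[str]]   # question id -> answer, None or "" when left blank


def pretest_summary(sent: int, replies: List[Reply],
                    question_ids: List[str]) -> Tuple[float, Dict[str, float]]:
    """Return the return rate and, per question, the share of returned forms answering it."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    return_rate = len(replies) / sent
    completion = {}
    for qid in question_ids:
        answered = sum(1 for r in replies if r.get(qid) not in (None, ""))
        completion[qid] = answered / len(replies) if replies else 0.0
    return return_rate, completion


def needs_revision(return_rate: float, completion: Dict[str, float],
                   min_return: float = 0.5,
                   min_completion: float = 0.8) -> Tuple[bool, List[str]]:
    """Flag the questionnaire for modification; both thresholds are assumed values."""
    weak = [q for q, share in completion.items() if share < min_completion]
    return (return_rate < min_return or bool(weak)), weak


# Hypothetical pretest: 10 questionnaires mailed, 4 returned.
replies = [
    {"Q1": "Yes", "Q2": "Monthly"},
    {"Q1": "No", "Q2": ""},
    {"Q1": "Yes", "Q2": None},
    {"Q1": "Yes", "Q2": "Weekly"},
]
rate, completion = pretest_summary(10, replies, ["Q1", "Q2"])
revise, weak_questions = needs_revision(rate, completion)
```

In this made-up pretest the questionnaire would be sent back for modification, both because few forms were returned and because the second question was often left blank; the revised form would then be pretested again.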
PROBLEM OF RESPONSE
The problems of response generally pose a difficult situation for
the researcher. In other methods the investigator is able to
collect information in one way or another, but in the
questionnaire method in particular, where the investigator
or the researcher is not present before the respondent, the
respondent may not bother to provide the requisite information.
The problem of lack of response has to be tackled properly. This
means that the investigator should try to surmount the
difficulties that are responsible for poor response or lack of
response. Generally the following factors are responsible for
lack of response or poor response.
1. Importance of the problem under study: If the
problem is of importance to the respondents and they realize
this, then they come out with the answers easily. It is
generally seen that those who are vitally concerned with the
problem give a greater response than those who are not.
2. Characteristics of the respondents and prestige
of the sponsoring body: Certain groups are more
responsive to the questionnaire method. These
characteristics relate to age, sex, economic status,
educational level and political consciousness; some such
groups are more responsive than people belonging to the
higher economic group. If the research study is sponsored by
a well-known organization it is likely to have a better response.
3. Form and nature of the questionnaire and
arrangement of the questions: The questionnaire also plays
its part in the matter of response. If the questionnaire is
short and handy, has been printed in an attractive manner, its
layout is neat and attractive, and the arrangement of the
questions is scientifically planned, it is likely to invite a
better response.
4. Nature of reactions: If the respondents have a
strong feeling about the problem, they are likely to fill in the
questionnaire. This means that those who are strongly in
favour of or against the problem shall send a response. Those
who have a lukewarm feeling about the problem are not likely
to send responses in large numbers.
5. Inducements for response: Some
researchers are of the view that respondents normally
need inducement. These may be classified under the following
heads.
a) Monetary inducement: Monetary inducements are
given generally to people who are economically weak or
likely to be influenced by money considerations. This
money is given in advance or after receiving the filled-in
questionnaire.
b) Nonmonetary inducements: Nonmonetary
inducements may be in the form of any reward. It may be
a letter of appreciation or a mention of the name in the
report of the study and so on. Letters of appeal or
appreciation are said to be a very successful method of
nonmonetary inducement.
c) Reminder and follow-up: Sometimes when
questionnaires are not received back, a reminder is sent.
Sometimes one reminder is sufficient and sometimes
more than one is needed. On the basis of study, it has been
found that follow-up letters or reminders are of great
value in securing responses.
The reminders may be sent in the form of a postcard. It is said
that three reminders should be sent at regular intervals. When
no response is available even after the third reminder, such
persons should be dropped from the list of respondents.
Reminders may be telephonic or through a messenger or an
intermediary. The time of sending the questionnaire also
determines the response. If it reaches the respondents
when they have leisure, it is likely to get a higher rate of
response. Furthermore, it has to be ensured that the
address is correct.
In order to have a correct response within the specified time,
suitable planning and timely action have to be taken.
USING SCHEDULE METHOD FOR DATA COLLECTION
Schedule is the name usually applied to a proforma containing a
set of questions which are asked and filled in by an interviewer in
a face-to-face situation with a respondent. It is a standardized
device or tool of observation to collect data in an objective
manner. In this method the interviewer puts certain questions,
the respondent furnishes certain answers, and the interviewer
records them in a research instrument called the schedule.
PURPOSES/OBJECTIVES OF THE SCHEDULE
The main objectives of schedules are as follows:
i) Delimitation of the topic: A schedule is always
about a definite item of the research study; its subject
is a single item rather than the research
subject in general. The schedule delimits and
specifies the subject of inquiry.
ii) Aids memory: It is not possible for the interviewer
to keep in mind or memorize all the information
that he has collected from different respondents. If
no standardized tool is available he might put
different questions to different persons and thereby
get confused when he has to analyse and tabulate
the data. The schedule acts as an aide-memoire.
iii) Aid to classification and analysis: Another
objective of the schedule is to help tabulate and analyse the
data collected in a scientific manner. Through
schedules, the interviewer can collect the matter in a
homogeneous manner.
TYPES OF SCHEDULES
Schedules that are used in business research are classified as
under:
i) Observation schedule: The schedules that are used for
observation are known as observation schedules. In these
schedules the observer records the activities and responses of
a worker or a group under specific conditions. The main
purpose of the observation schedule is to verify
information.
ii) Rating schedule: In the fields of business guidance,
psychological research and social research, rating
schedules are used to assess attitudes, opinions,
preferences, inhibitions and other like
elements. As is evident from the term rating, in these
schedules the value and the trend of the above-mentioned
qualities are measured on a rating scale.
iii) Document schedule: Schedules of this type are used
to obtain data regarding written evidence and case
histories from autobiographies, diaries or final
records of governments etc. It is a good method for
collecting exploratory data or preparing a source list.
iv) Interview schedule: In an interview schedule an
interviewer presents the questions of the schedule to the
interviewee and records the responses in the blank spaces.
CHARACTERISTICS OF A GOOD SCHEDULE
The following are the essentials or characteristics of a good
schedule.
1. Accurate communication: It means that the questions that
are given in schedule should be such that the respondent is
able to understand them in the light in which they are asked.
For accurate communication, the questions should be of the
following types.
a) Questions should be interlinked: It means that if
information about different aspects is required, the
questions asked should be such that their answers may
present compact picture of the information.
b) Suggestive questions: The questions should be
suggestive. There should be questions on each topic, but
the questions should be so designed that the respondent
may be stimulates to give the correct answers
2. Accurate response: It means that the schedule, should be
such that the required information may be easily secured.
For this the interviewer has to prepare the schedule in a
scientific manner and also make efforts to inspire the
respondent to give answers. For this the following steps
should be taken.
a) The size of the schedule should be attractive: It
should not be too lengthy to make the respondent bored.
b) The questions of the schedule should be clearly
worded and be unambiguous: Even if some
unpalatable information is to be collected the framer of
the schedule should couch his questions in such a
language that the information is secured without injuring
the feelings of the respondent.
c) The questions should be free from subjective evaluation: The
questions should be relevant and pointed. They should
not deal with subjective evaluation.
d) Information sought should be capable of being
tabulated: Questions of the schedule should be so
framed that the information collected through them
should be capable of being tabulated and if needed,
subject to statistical analysis.
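As a minimal sketch of what "capable of being tabulated" means in practice, the Python snippet below counts coded answers to one closed-ended schedule question and expresses each count as a percentage. The question, the answer codes and the responses are hypothetical and are invented here only for illustration.

```python
from collections import Counter

# Hypothetical coded answers to one closed-ended schedule question,
# e.g. "Are you satisfied with the service?" (illustrative data only).
responses = ["yes", "no", "yes", "yes", "undecided", "no", "yes"]


def tabulate(answers):
    """Return {answer: (count, percentage)} for a list of coded answers."""
    counts = Counter(answers)
    total = len(answers)
    return {
        answer: (count, round(100 * count / total, 1))
        for answer, count in counts.items()
    }


table = tabulate(responses)
for answer, (count, pct) in sorted(table.items()):
    print(f"{answer:<10} {count:>3} {pct:>6}%")
```

Open-ended answers would first have to be coded into such categories before a table of this kind could be prepared.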
SUITABILITY OF SCHEDULE METHOD
This method is generally employed in following situations:
a) The field of investigation is wide
b) Where the researcher/investigator requires quick results at
low cost.
c) Where the respondents are educated.
d) Where trained and educated investigators are available.
LIMITATIONS OF SCHEDULE METHOD
Schedule method of data collection, like all other methods, has
limitations. Some of these are:
1. Costly and time consuming: As compared with the
questionnaire, the schedule method is generally costlier and
more time consuming. This factor becomes a serious
limitation when the respondents are physically scattered
over a wide area. To approach all of them is prohibitively
costly, besides involving excessive time.
2. Requirement of a large number of well-trained field
workers: The schedule method requires a very large number
of well-trained and experienced field workers. It
becomes very difficult, if not impossible, to hire a large
number of experienced workers. It involves great cost,
sometimes so many workers are not easily available, and one
cannot depend upon inexperienced hands.
3. Adverse Effect of Personal Presence: Whereas the
personal presence of the interviewer proves helpful in securing the response
and removing the respondents' doubts, it also becomes an inhibiting factor.
Many people can note down certain facts on paper but
cannot say them in the presence of others.
4. Organizational Difficulty: If the field of research is
geographically wide, it becomes difficult to organize
research. To gather workers who are well acquainted with
various geographical regions and different types of people is a
mammoth task. Though not beyond achievement, it is
certainly difficult.
From the above discussion, it is clear that the schedule
method can be used successfully in a limited region and on a
limited number of respondents.
DISTINCTION BETWEEN SCHEDULE AND QUESTIONNAIRE
In a schedule there are a number of questions. In a
questionnaire also we collect information through a set of
questions. Some people remark that a schedule is a sort of
informal questionnaire. Prima facie, the two are more or less
the same, but there is a vital difference between the two.
From the point of view of the objective, the schedule and
questionnaire are very much similar, but from the point of
view of methodology, reliability etc., the two are
different from one another. The difference between the two lies
in regard to the following points:
1. Methodology: The schedule is based on the direct
method. In it the researcher or investigator himself
collects the information by asking questions, but in
the questionnaire method the investigator does not meet
the respondents in the field. Questionnaire method is an indirect method
of collecting the data. The geographical area covered through
the questionnaire method is generally wide. Since it is not
possible for an investigator to reach every individual who
forms a subject of the research area, he resorts to
the questionnaire method.
2. Reliability of the information collected: It is said that
the information that is collected through schedule is more
reliable. It is a direct method so reliability is more
guaranteed. Even if some wrong information creeps in,
the researcher can verify it with the respondent. In
questionnaire method, the reliability is not guaranteed.
The respondent fills the questionnaire and mails or sends
it back. If anything has been left out, the interviewer or
researcher has no method to correct the incorrect
information.
3. Types of questions: Generally the questions that are
included in the schedule are short and pointed. Their
answers are sometimes limited to
saying yes or no. There is no possibility of irrelevant
and unnecessary questions being given place in the
schedule. This is not necessarily the case with the
questionnaire. Questions contained in the questionnaire
are often such that they require long answers. When the research topic is
wide and controlled observation is to be made, the
questions have to be very wide and detailed. Only when the
respondent gives detailed information is it
possible for the investigator or researcher to analyse the
information.
4. Factor of time and distance: In the schedule method,
observation is not the sole criterion; as a result of the
exchange of thoughts between the interviewer and the
interviewed, the background is prepared. This is helpful in
removing the distance between the investigator and the
respondent.
5. Distribution: There is one more difference between the
schedule method and questionnaire method from the
point of view of distribution. It is up to the respondent to
send a complete questionnaire or not. He may not send a
complete questionnaire, but this is not the case in the
schedule method. The interviewer or investigator, if he is
tactful, can persuade the respondent to furnish answers to
the questions in the schedule. In this method, chances of
poor or low response are less as compared to the
questionnaire method.
6. Clarification of questions: In schedule method, the
researcher can clarify the meaning or underlying idea of
the questions of the schedule to the respondents who are
not educated or are not intelligent enough, as to grasp
the underlying idea. That is why schedule method can be
used even when the respondents are illiterate or have
lower I.Q. In questionnaire method there is no such
possibility.
7. Use in sampling method: Schedule method can be
easily and safely used in sampling method of research.
Every person who is included in the sample, because of
the personal influence of the interviewer, furnishes the
information correctly and reliably. Since the services of
interviewer are available for clarification and analysis, the
respondent or the sample is able to answer correctly and
give representative answers. Questionnaire method is not
useful for sampling method. Since the investigator is not
present, the respondent who has to furnish answers is
sometimes not able to answer all the questions correctly
and as required. Due to this limitation the data collected
is not representative.
8. Instrument Design: The schedule is generally framed
keeping in view the difficulties of the tabulators and field
workers. The questions in the schedule are so framed that
the field workers can have access to the sample. The
questions in the questionnaire, by contrast, are
not framed on the basis of the educational and economic
standards of the respondent. This sometimes creates
problems for the investigator or researcher when the stage
of tabulation is reached.
QUESTIONS TO BE INCLUDED IN THE SCHEDULE
The basic thing while framing a schedule is that those questions
should be included which reflect the nature of the study and
problem. Normally the questions that are included in the
schedule should have the following characteristics:
i) Questions should be short, clearly worded, simple and not
difficult for the respondent.
ii) The questions should have a direct bearing on the
problem.
iii) The questions should be such that the information that is
collected through them can be subject to processing and
tabulation.
iv) The questions should be interrelated and they should be
such that cross checking may be possible.
v) Questions should be free from personal bias and they
should not injure the feelings of the respondents.
vi) Questions should be standardized and precise terms
should be used.
vii) Questions should be thorough and standardized and they
should be such that the respondent has to make the
minimum of effort. If the questions are cumbersome and
require too much deliberation on the part of the
respondents they shall not invite accurate and easy
replies.
ORGANIZATION OF SCHEDULE
Once the schedule has been properly and scientifically framed,
the process of interview starts. It is through the interview that the
schedule is completed and the data collected. For this purpose,
the following steps have to be taken.
1. Selection of the respondents: The first thing that has to
be done after the framing of the schedule is to select the
proper type and number of informants or respondents.
Generally the sampling method is employed in the use of the
schedule. The sample selected should be perfectly representative.
The sample having been selected, the names and addresses of its members
should be legibly and correctly noted. This would enable the
field workers to approach them.
2. Selection, training and job of the field worker: In the
schedule method, it is the field workers who carry on the
interview and collect data. Since there is a dearth of field
workers, they have to be selected according to the
requirements and characteristics of the study. They have
also to be trained accordingly. Apart from the training, they
should possess certain basic characteristics to conduct
the study properly. The field worker has to possess the
following characteristics:
a) Honesty and integrity
b) Initiative and tactfulness
c) Patience
d) Smartness
e) Unbiased and scientific outlook
f) Interest in research area
g) Vigilant
h) Knowledgeable about the subject of study
i) Trained in technique and methods of study
3. Interview and correct replies: In the schedule method,
success very much depends upon the results of the
interview. If the field worker has made a successful
interview, there is every likelihood that he has collected the
correct information. It requires the following things:
a) Correct approach: It means that the field worker should
approach the respondent in such a manner that he may
get the right expression from him. Generally he should
be approached when he is not busy or through some such
contact that he may not refuse to provide the
information.
b) Proper response: Proper response is the result of a
proper approach. Apart from it, the proper response
depends upon other factors also. For this the field worker
should be able to convince the respondent.
c) Correct reply: The field worker collects his data on the
basis of the answers given by the respondent. It involves
two factors: one is the correctness of the schedule, the
second the proper approach to the respondent. For proper
response and correct reply, the field worker should use probing
questions, but without injuring the feelings of the
respondents.
4. Testing the validity of the results: When the schedule has
been returned by the field worker, it should be subjected to
certain tests and checks so that it may be found out whether
the data collected are correct. It can be done in various
ways. The investigator may himself select certain
respondents and interview them again. In case the reply is
different from what has actually been recorded on the
schedule, the whole lot should be either rejected or
subjected to study again. If there is slight variation, the
validity should not be doubted.
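The re-interview check described above can be sketched in Python as follows. The function name, the data layout and the 10 percent tolerance standing in for "slight variation" are assumptions made for illustration; the text does not prescribe a numerical threshold.

```python
def recheck_batch(recorded, reinterviewed, tolerance=0.10):
    """Compare recorded schedules with the investigator's re-interviews.

    recorded, reinterviewed: dicts mapping respondent id -> {question: answer}.
    tolerance: assumed share of mismatched answers still treated as
               "slight variation" (hypothetical; not given in the text).
    Returns (mismatch_rate, decision).
    """
    compared = 0
    mismatched = 0
    for respondent, answers in reinterviewed.items():
        original = recorded.get(respondent, {})
        for question, answer in answers.items():
            compared += 1
            if original.get(question) != answer:
                mismatched += 1
    if compared == 0:
        raise ValueError("no re-interview answers to compare")
    rate = mismatched / compared
    decision = "accept" if rate <= tolerance else "reject or re-study"
    return rate, decision


# Illustrative data: two respondents re-interviewed by the investigator.
recorded = {
    "R1": {"q1": "yes", "q2": "no"},
    "R2": {"q1": "no", "q2": "no"},
}
reinterviewed = {
    "R1": {"q1": "yes", "q2": "no"},
    "R2": {"q1": "yes", "q2": "no"},
}
rate, decision = recheck_batch(recorded, reinterviewed)
print(rate, decision)
```

Here one of the four re-checked answers differs, which exceeds the assumed tolerance, so the batch would be sent back for study again.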
DOCUMENTED SOURCES OF DATA
Meaning of document: A document is a very important,
dependable and valuable source of information. Many
researchers have made use of this vital source. A document is
nothing but a written record that contains important information
about a problem or aspect of study. It may be a report, a diary, a
letter, a history, official and non-official records,
proceedings of the legislature, committees, societies, surveys,
journals, periodicals, speeches etc.
Types of Document: Strictly speaking it is very difficult to
classify the documents. All the documents have different types
and traits and elements in them. For the convenience of the
study they have been classified under the following two heads:
i. Personal documents
ii. Public documents
PERSONAL DOCUMENTS
These documents are recorded by individuals. An individual
may record his views and thoughts about various problems. He
may do so because of his personal interest in those problems
and without knowing that these documents at a later date may
form a subject or source of study.
Types of personal documents: Personal documents may be
categorized or divided under the following heads for the
convenience of the study:
i) Life history
ii) Spontaneous autobiography
iii) Voluntary autobiography or self-record
iv) Letters
v) Memoirs
In the area of business research these personal documents are not
of much use except in case these pertain to business leaders.
Even in such a case they have limited use.
PUBLIC DOCUMENTS
Public documents are quite different from personal documents.
They deal with matters of public interest. Public
documents may be divided into the following two categories.
i) Unpublished records: Such records, although they deal
with matters of public interest, are not available to
people in published form. It means that everybody cannot
have access to them. Proceedings of meetings, notings
on the files and memoranda etc., form the category of
unpublished records. It is said that these records are very
reliable. Since there is no fear of their being made public, the
writers give out their views clearly.
ii) Published records: These records are available to people
for investigation and perusal. Survey reports, report of
enquiries and such other documents fall under this category.
The data contained in these documents are considered by some
people as quite reliable because the collecting agency knows
that it shall be difficult to test, while others are of the view that
if the data are to be published, the collecting or publishing
agency does some window-dressing as a result of which the
accuracy is sometimes doubtful.
Most of the information that is now available to people and
researchers in regard to business environment is to be found
in the form of reports. The reports published by government are
considered as more dependable. On the other hand some people
think that the reports that are published by certain private
individuals and agencies are more dependable and reliable.
STATE OF DOCUMENTED DATA IN INDIA
Published data in India: In India the bulk of data is published
by the government, whether the central government or the state
governments. Very few public institutions or individuals
publish data. In spite of it, it cannot be said that the data
published by public institutions, private agencies or individuals
are negligible. The data published by these agencies occupy
quite an important place in the field of study.
Types of the published Data in India: The data published in
India or the published data available in India deal with the
following aspects:
i) Data regarding population or demographic data: After
every 10 years the census of population is done in this
country. The last census was done in the year 2001 and on
the basis of that census data regarding various aspects of
the population have been collected and compiled.
Demographic data deal with the following aspects:
(a) Total population, (b) Sex ratio, (c) Rural and urban
population, the ratio between the two types of population and the
total of the area covered, (d) Family organization and set
up, (e) Data regarding age and age ratio, (f) Data
regarding marriage and marriage pattern of different
groups, (g) Data regarding languages, dialects etc., (h)
Data regarding percentage of literacy, (i) Record of
nationality, religion, tribes etc.
ii) Data regarding health: This type of data relate to the
following aspects of the health of the nation:
(a) Health, birth and mortality. (b) Causes of diseases,
death and mortality. (c) Rate of mortality in different age
groups. (d) Data regarding medical facilities etc.
iii) Data regarding socio-economic aspect of life: The
data that have been published by the government or
other agencies in India in regard to socio-economic
aspects of life of the individual or society deal with the
following aspects:
(a) Consumption pattern and food habits. (b) Statistics
regarding employment. (c) Statistics about economic
development.
Problems of the Data Published in India: We have already seen
that the bulk of the data available in the published form in India is
the result of the efforts of government agencies. After
independence, an independent organization that is responsible
for collection of the data and statistics and its publication has
been set up. The Planning Commission and the Planning
Committee of the state government have their own
organizations. In order to fix the targets of plans and to
accomplish those targets within specified period every attempt
has been made to make the data accurate and reliable. This is
also true that data that are now available in published form are
reliable and accurate to a very great extent. In spite of it the
published data suffer from the following drawbacks:
a) Data about all the aspects of business and economic
activity are not collected.
b) Even the Government of India does not have up-to-date and
latest data about many socio-economic aspects.
c) The data collected lack in homogeneity and continuity.
d) Wrong notion about confidential data: Government
offices have a very wrong notion about confidential
data. They keep most of the collected data as
confidential. This defeats the very purpose of the
collection of data.
e) The method of publication is very defective. A good deal
of data remain on the file. Due to red-tapism in the
government, the data are published so late that it
becomes impossible to make meaningful use of them.
f) The data collected by the Government Agencies are not
beyond doubt. This is due to the approach of the
administration and also because of the method of the
collection of data. The resources that are put at the
disposal of the machinery that is entrusted for the task of
the collection of data, are very meager.
Activity 6.10
If you were researching the economic development of India
post-independence, what kind of documents would you look into?
CASE STUDY METHOD
Case study method may be defined as a small, inclusive and
intensive study of an individual in which the investigator brings to
bear all his skills and methods, or as a systematic gathering of
enough information about a person to permit one to understand
how he or she functions as a unit of society. The case study is a
form of qualitative analysis involving a very
careful and complete observation of a person, situation or
institution.
Case study is a method of exploring and analyzing business
aspects of an industrial unit, or even an entire industry.
Characteristics of a case study: The important
characteristics of case study method are as under:
i) Study of a unit: The case study method studies a
subject matter which forms a cohesive whole and may be
treated as a unit. The unit can be an individual, a family or
an institution.
ii) Intensive or In-Depth Study: Case study attempts a
deep and detailed study of the unit. It is a method of
study in depth rather than breadth. It places more
emphasis on the full analysis of a limited number of
events or conditions and their interrelations.
iii) Knowledge of behaviour patterns: The case study
method deals with both what and why of the subject. It
tries to describe the complex behavioural pattern of a unit
and having done this, tries to discover the factors which
will rationally account for them. In brief, case study
method aims at description as well as explanation of the
unit it studies. It also explains the place and role of a unit
in its surrounding social milieu.
iv) The study of the whole unit: The case study method
tries to perceive the unitary forces of the subject matter
and organizes it into an integral whole.
Basic Assumptions of the case study method: Following are the
basic assumptions of the case study method.
a) Totality of the being: In this method, the unit of the
study which may be an individual or a business group is
studied as a unit. This study is confined to a particular
time or situation but it is in totality or in all its aspects.
b) Underlying Unit: It is believed that a unit is the
representative of a type and it should be studied as a
type rather than as an individual unit. This assumption
involves that the units are the same and there is no
difference in studying a particular unit.
c) Complexity of the business environment: This study
is based on the assumption that the business
environment is a very complex affair and so deeper study
is required. Therefore the case data are gathered of an
entire life cycle of an industrial product or unit.
d) Influence of the time factor: Business environment
gets influenced by the time factor, as it is dynamic in
nature. Accordingly, the study cannot be worthwhile
unless it is long range and over a considerable period of
time so that institutions and groups in their various
aspects may become well known.
e) Similarity of response in human beings: The case
study method believes that the fundamental responses
from human beings will be more or less the same.
f) Resources or circumstances: Besides believing in the
fundamental consistency of human nature, it also
believes that business conditions and circumstances
tend to recur from time to time with marginal changes.
Advantages of case study method: The case study method is a
very popular method of collecting data about industrial units and
industries as a whole. The main advantages of the case study
method are as follows:
i) Intensive and deep study of the problems is
possible: The case study method enables one to
understand fully the behaviour pattern of the concerned
unit. In this method the problem is recognized as a unit
and various aspects of the problem are subjected to deep
and detailed study.
ii) Study of the subjective aspects: Through case study a
researcher can obtain a real and enlightened record of
specific experiences, which would reveal a business unit's
problems and the motivation that drives the unit to adopt a
certain pattern of behaviour.
iii) Comparative Study: In this method, all the aspects of
life of an industrial unit are studied. Through this study
the characteristics of one particular industry or firm may
be differentiated from the characteristics of another. Thus
the method is helpful in comparative study.
iv) Formulation of valid hypothesis: Once the various
cases are extensively studied and analysed, the
researcher can deduce various generalizations, which may
be developed into useful hypotheses.
v) Useful in framing questionnaires and schedules:
Case study is of great help in framing questionnaires,
schedules and other forms. This in turn helps in getting
prompt response.
vi) Sampling: Case study is helpful in stratification of the
sample. By studying the individual units the researcher
can put them in definite classes or types and thereby
facilitate the perfect stratification of the sample.
vii) Study of process: In cases where the problem under
study constitutes a process and not one incident, e.g.
merger process, cartel formation etc., the case study is
the appropriate method.
viii) Use of several research methods: The researcher can
use one or more of the several research methods under
the case study method depending upon the prevalent
circumstances.
Limitations of the Case Study Method
i) Unrealistic assumptions: Case study method is based
on several assumptions which may not be very realistic at
times, and as such the usefulness of case data is always
subject to doubt.
ii) Problem of finance, time and energy: Through this
method, it is not possible to cover a large area for
study. It requires large amounts of finance, time and energy to
complete the study.
iii) False generalization: Generalizations are formulated on
the basis of data collected. If the data collected are
wrong, the generalizations shall be wrong.
iv) Difficult to test reliability or validity of the data
collected: This study is based on the information that is
given by an individual or a firm. If the information
furnished is wrong, there is no method on the basis of
which researcher can test the reliability of the
information.
v) Not possible to apply sampling method: In this
method an individual is recognized as a unit and the entire
study is concentrated on that unit. The unit may not
be representative of the universe. Through the
study of a particular unit, it is not possible to apply the
knowledge to other units.
vi) Defective records: If the records on the basis of which
data are collected are defective and not based on
objectivity, the generalization becomes defective.
vii) Lack of quantitative study: It is not possible to
quantify feelings, emotions, reactions, values etc.,
and as such the method lacks scientific temper.
PROCEDURE OF CASE STUDY
The following four stages are followed in case study method:
i) Statement of the problem: In this process, the problem is
selected and studied. It includes the following subjects.
a) Selection of the cases
b) Types of the units
c) Number of cases
d) Scope of analysis
ii) Description of the process event: In the case study
method, the process event is to be studied. It requires that
the investigator should describe this process.
iii) Determinants or factors: In every event or episode
there are certain factors that are responsible for that
event. In case study method, these determinants or
factors are also studied. These factors may be classified
under the following two heads.
a) General factors
b) Specific factors
iv) Analysis and conclusions of the factors: This is the
final study stage. The factors and analysis and
conclusions about the role of these factors are recorded.
Activity 6.11
How would you deploy Case Study method for researching into
deployment of CRM software in logistics companies?





SUMMARY
In this unit the definition and the characteristics of data have
been explained. The differences between primary data and
secondary data have been highlighted. The advantages,
disadvantages of primary and secondary data have been
described. The factors affecting choice of data have been dealt with.
The important factors to be considered in statistical
investigation have been explained.
The various methods of data collection, i.e., observation method,
personal interview, forms, schedules or questionnaire
method, case study method, library method and documentary
sources of data, have been explained. Their relative
advantages, limitations, applications, procedure and precautions
in use have been described in detail.
REVIEW QUESTIONS
Q. 1 Define the term data and state its characteristics.
Q. 2 What is primary data? What are the advantages and
limitations of primary data?
Q. 3 What is secondary data? Distinguish between primary
data and secondary data.
Q. 4 State the advantages and disadvantages of secondary
data
Q. 5 State the essential characteristics of secondary data.
Q. 6 Explain the sources of secondary data.
Q. 7 Explain the factors that affect the choice of data, viz
primary or secondary, to be used in investigation.
Q. 8 Define the term observation method of data collection.
State its characteristics merits and demerits.
Q. 9 Explain the types of observations and state their merits
and limitations.
Q.10 Distinguish between
a) Uncontrolled v/s controlled observations.
b) Participant observations v/s non participant
observation.
Q.11 What are the techniques used in controlled observations?
Q.12 Explain how the organization of field observation is done.
Q.13 Explain the interview method of data collection. State its
characteristics, and aims.
Q.14 Describe the importance or significance of interview
method in business research.
Q.15 Explain the merits and limitations of interview method.
Q.16 Distinguish between
a) Formal and Informal interview
b) Personal interview and group interview
c) Diagnostic interview and treatment interview
d) Qualitative interview and Quantitative interview
e) Short contact interview and long contact interview
f) Non directive interview and focused interview.
Q.17 Write Short Notes
i) Research interview
ii) Repeated interview
iii) Preparation for interview
iv) Interview guide
v) The Interviewers Role
Q.18 What are the causes of errors arising in the interview
method? Explain the means of avoiding such errors
in the interview method.
FURTHER READINGS
1. Cooper, Donald R. & Schindler, Pamela S. (2006).
Business Research Methods, Tata McGraw Hill Companies;
India.
2. Kothari, C.R. (2004). Research Methodology: Methods and
Techniques, New Age International (P) Limited: New Delhi.
3. Jhunjhunwala, Bharat (2008). Business Statistics, S
Chand & Co. New Delhi.
4. Beri, G.C. (2009). Business Statistics, IIIrd Ed. Tata
McGraw Hill Pvt. Ltd.; India.
CHAPTER 7 SAMPLING AND SAMPLING
DESIGN
Objective
After reading this chapter, the learners would be able to:
• Understand the concepts and theory behind sampling.
• Distinguish between census and sampling methods of
data collection.
• Know about various methods of sampling and their
reliability.
• Determine the sample size in various research
situations and appreciate various sampling and
non-sampling errors that creep in.
Structure
• Basic Definitions
• Laws of Sampling
• Theory of Sampling
• Scope of Census Method of Data Collection
• Scope of Sampling Method of Data Collection
• Methods of Sampling
• Reliability of the Sampling
• Size of Sample
• Determination of Sample Size
• Sampling and Non-Sampling Errors
• Summary
• Review Questions
• Further Reading
BASIC DEFINITIONS
SAMPLING
Sampling may be defined as the selection of some part of an
aggregate or totality on the basis of which a judgment or
inference about the aggregate or totality is made. Sampling is
simply the process of learning about the population on the basis
of a sample drawn from it. Thus in the sampling technique
instead of every unit of the Universe only a part of the Universe
is studied and the conclusions are drawn on that basis for the
entire Universe.
UNIVERSE/POPULATION
From statistical analysis point of view, we often refer to two
terms viz, universe and population. The term universe implies
the total of all the items or units of analysis in the field of
proposed research. On the other hand, the term population implies
the total number of items for which information is sought. In
research parlance, the units of research which possess the
attributes required to achieve the research objectives are known
as elementary units. The aggregation of such elementary units is
termed the population. Therefore, it is implied that all units of
any research project will constitute universe, and all the
elementary units (defined on the basis of research objective
attributes) constitute population.
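The distinction can be illustrated with a small Python sketch. The firms, their attributes and the selection rule (small firms that export) are hypothetical and chosen only to show how elementary units are picked out of the universe.

```python
# Hypothetical universe: every firm in the field of the proposed research.
universe = [
    {"name": "Firm A", "employees": 12, "exports": True},
    {"name": "Firm B", "employees": 250, "exports": False},
    {"name": "Firm C", "employees": 40, "exports": True},
    {"name": "Firm D", "employees": 8, "exports": False},
]


def is_elementary_unit(unit):
    """Assumed research attribute: firms with under 50 employees that export."""
    return unit["employees"] < 50 and unit["exports"]


# The population is the aggregate of elementary units within the universe.
population = [unit for unit in universe if is_elementary_unit(unit)]
print(len(universe), len(population))
```

With a different research objective the same universe would yield a different population, because the rule defining an elementary unit would change.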
Universe - Finite and Infinite: If the number of units or
items or members in a universe is definite and fixed, it is a Finite
Universe. For example, the number of feature films produced in the
world is finite and can be known. The number of districts in
India is another example of a finite universe. If the number of
units or items is not fixed and finite in a universe, it is known as an
Infinite Universe. Examples are the number of stars in the
sky, water in the sea etc.
Universe - Existent and Hypothetical: An Existent Universe
is one which comprises tangible items or units, such as the number of
books in a library or the number of companies paying corporate
tax. A Hypothetical Universe is one which is not existent. It is
imaginary. An example of a hypothetical universe is the moves
on a chess board. All possible and conceivable moves of chess
can never be realized. They can only be imagined.
SAMPLE
A sample is that part of the universe which we select for the
purpose of investigation. A sample exhibits the characteristics of
the universe. The word sample literally means small universe.
For example, suppose the microchips produced in a factory are
to be tested. The aggregate of all such items is the universe, but it
is not possible to test every item. So in such a case, a part of
the universe is taken and tested. The quantity
extracted for testing is known as the sample.
COMPLETE ENUMERATION OR CENSUS
If detailed information regarding every individual person or item
of a given universe is collected, the enquiry is called
complete enumeration. Another common name for complete
enumeration is census. For example, in India the census
department conducts a population census every ten years.
SAMPLING FRAME
The term sampling frame or population frame refers to the
listing, with proper identification, of all items in the population
under study. For example, if we want to find out the capital
invested and the number of workers employed in small scale
industries in Bhopal, we must have a complete list of names and
addresses of all the small scale firms. This list of names and
addresses is called the sampling frame.
SAMPLING DESIGN
A sampling design is a definite plan for obtaining a sample from
the sampling frame. It refers to the technique or procedure of
selecting some sampling units from which inferences about the
population are drawn.
STATISTICS AND PARAMETERS
A statistic is a characteristic of a sample and a parameter is a
characteristic of the universe or population.
SAMPLING ERRORS
In a sample survey only a small part of the universe or population
is studied; as such, there is every possibility that the results of
different samples would differ from each other and from the true
population value. These differences constitute the errors
due to sampling and are known as sampling errors.
PRECISION
Precision is the range within which the population average (or
other parameter) will lie in accordance with the reliability
specified in the confidence level, expressed as a percentage of the
estimate or as a numerical quantity. For example, if the estimate is Rs.
5000 and the precision desired is ±4%, then the true value will
be no less than Rs. 4800 and no more than Rs. 5200.
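The arithmetic of the example can be checked with a minimal sketch. The function name `precision_limits` is hypothetical and chosen only for illustration; it simply takes the stated percentage of the estimate as the margin on either side.

```python
def precision_limits(estimate, precision_pct):
    """Return (lower, upper) limits for an estimate and a desired precision in percent."""
    margin = estimate * precision_pct / 100   # 4% of 5000 = 200
    return estimate - margin, estimate + margin


lower, upper = precision_limits(5000, 4)
print(lower, upper)  # 4800.0 5200.0
```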
CONFIDENCE LEVEL AND SIGNIFICANCE LEVEL
The confidence level, which is also termed reliability, is
expressed as the percentage of times that an actual value
will fall within the prescribed precision limits. For example, a
confidence level of 90% implies that
if we repeat a particular exercise 100 times, then about 90 times the
population parameter under consideration will lie within the
prescribed limits. The significance level, on the other hand,
indicates the likelihood of the observation falling outside the
prescribed range. Therefore, if the confidence level (CL) is 90%,
then the significance level (SL) would be 10%. If the confidence
level is 98%, then the significance level is 2%. The
significance level can thus be expressed mathematically as SL% =
100% - CL%.
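The "repeat the exercise many times" reading of a confidence level can be illustrated by simulation. The sketch below is only an illustration under stated assumptions: the population is taken to be normal with a hypothetical mean of 50 and standard deviation of 10, the limits are built with the large-sample multiplier z = 1.645 for a nominal 90% level, and the function name `coverage_rate` is invented for this example.

```python
import random
import statistics


def coverage_rate(true_mean=50.0, true_sd=10.0, n=40, repetitions=1000, z=1.645, seed=1):
    """Fraction of repeated samples whose limits contain the true population mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repetitions):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        mean = statistics.mean(sample)
        margin = z * statistics.stdev(sample) / n ** 0.5   # half-width of the limits
        if mean - margin <= true_mean <= mean + margin:
            hits += 1
    return hits / repetitions


cl = coverage_rate()
print(f"observed confidence level ~ {cl:.0%}, significance level ~ {1 - cl:.0%}")
```

The observed share of "hits" should be close to 90%, and the share of misses (the significance level) close to 10%, in line with SL% = 100% - CL%.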
SAMPLING DISTRIBUTION
If we take a certain number of samples and for each sample
compute various statistical measures such as the mean, standard
deviation etc., we find that each sample may give
its own value for the statistic under consideration. All such values
of a particular statistic, say the mean, together with their relative
frequencies constitute the sampling distribution of that statistic
(the sampling distribution of the mean, of the standard deviation etc.).
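A sampling distribution can be built empirically by drawing many samples from the same population and recording the statistic each time. The following is a rough sketch, assuming a made-up population of 10,000 values between 0 and 100; the function name `sampling_distribution_of_mean` and all numbers are illustrative only.

```python
import random
import statistics


def sampling_distribution_of_mean(population, sample_size, num_samples, seed=0):
    """Draw repeated random samples and return the list of their means."""
    rng = random.Random(seed)
    return [statistics.mean(rng.sample(population, sample_size))
            for _ in range(num_samples)]


# Hypothetical population: 10,000 values spread evenly between 0 and 100.
pop_rng = random.Random(42)
population = [pop_rng.uniform(0, 100) for _ in range(10_000)]

means = sampling_distribution_of_mean(population, sample_size=50, num_samples=500)
print("population mean:      ", round(statistics.mean(population), 2))
print("mean of sample means: ", round(statistics.mean(means), 2))
print("spread of sample means:", round(statistics.stdev(means), 2))
```

The sample means cluster around the population mean, and their spread is much smaller than the spread of individual values in the population.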
Activity 7.1
What could be the population and the sampling unit in the following
research propositions?
Study of literacy rate in a State






Disposable Income among families with both husband and wife
earning and having no kids.






Annual yield of rice in paddy fields of Western UP?






Measurement of population density in a metropolitan town?




DIFFERENCE BETWEEN POPULATION AND SAMPLE
                   Population                        Sample
Definition         Collection of items in totality   Part or portion of the population
                   being considered for study        chosen for study
Characteristics    Parameter                         Statistic
Symbols            Population size = N               Sample size = n
                   Population mean = μ               Sample mean = x̄
                   Population S.D. = σ               Sample S.D. = S
LAWS OF SAMPLING
Two fundamental principles on which the sampling theory rests
are:
a) The law of statistical regularity
b) The law of inertia of large numbers.
LAW OF STATISTICAL REGULARITY
Definition: The law states that if a moderately large number of
items are selected at random from a given population, the
characteristics of those items will reflect, to a fairly accurate
degree, the characteristics of the entire population. For
example, if 500 leaves are picked from a tree at random and the
average length is found out, the result will be nearly the same
as will be found if all the leaves of the tree are picked up and
measured.
RELIABILITY OF THE LAW
The reliability of the Law of Statistical Regularity depends on
two factors:
i) The larger the sample, the more reliable are its
indications. The reliability of a sample is proportional to
the square root of the number of items it contains, and the
larger the sample, the more representative and stable it
will be.
ii) The sample must be chosen at random.
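The square-root relationship in point (i) can be written out using the standard error of a sample mean. This formula is a standard result for simple random samples rather than something stated in the text; here σ denotes the population standard deviation and n the sample size, and the sample sizes 100 and 400 are illustrative.

```latex
\mathrm{SE}(\bar{x}) = \frac{\sigma}{\sqrt{n}},
\qquad
\frac{\mathrm{SE}_{n=400}}{\mathrm{SE}_{n=100}}
  = \frac{\sqrt{100}}{\sqrt{400}} = \frac{1}{2}
```

In other words, quadrupling the sample size only halves the expected sampling error.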
CHARACTERISTICS OF THE LAW
The main characteristics of the law are:
i) With the use of this law, a part of the universe can
represent it. Thus, when census is not possible due to
paucity of time, money and labour, then with the help of
this law and by using random sampling, investigations
can be made.
ii) If selection is made at random, then by this law, good,
bad and average, all units have equal chance of being
selected.
iii) With the help of this law, inferences drawn from an
enquiry at a particular time and place can be applied
to other times and places with little adjustment. For example, if
the rate of growth of population in Delhi is 10% per
annum, then there is a probability that in future also the
rate of growth would remain the same.
LIMITATIONS OF THE LAW
The main limitations of the law of statistical regularity are as
follows:
i) Selection of units should essentially be unbiased.
ii) The inferences drawn from this are applicable on an
average to other units of the universe.
iii) The sample should resemble the universe. If information is
collected from too small a number of units, the results
drawn from it cannot be applied to the whole universe.
UTILITY OF THE LAW
The law is very useful in the following two cases:
i) Sampling methods, and
ii) Interpolation and Extrapolation.
In the words of W. I. King, "It is upon this principle that
gambling joints are enabled to run continuously and
profitably with only small odds in their favour."
Activity 7.2
How do casinos utilize this principle to ensure profitability over a
period of time? Comment.





LAW OF INERTIA OF LARGE NUMBERS
Definition: The Law of Inertia of Large Numbers is a corollary
of the Law of Statistical Regularity and lays down that "in large
masses of data abnormalities will occur, but in all probability,
exceptional items will offset each other, leaving the average
unchanged subject, where the element of time enters, to the
general trend of the data". According to King, "The Law of
Inertia of Large Numbers asserts that large aggregates are
relatively more stable than small ones. The movements of an
aggregate are the resultants of the movements of its separate
parts; and it is improbable that the latter will all be moving in
the same direction at the same time. Consequently, their
movements will tend to compensate one another, and the larger
the number involved, the more complete will this compensation
be." Thus, the law states that the larger the number of items we
take out of a given universe, the greater is the probability of
accuracy. The law is based on the fact that if one part of a large
group varies in one direction, the probability is that another
equal part of the same group would vary in the opposite
direction, so that the total change would be insignificant. For
example, a study of wheat production might reveal substantial
variation from year to year as regards a particular region but
the total yield of wheat in India for those very years may show
only a slight variation.
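The wheat example can be mimicked with a small simulation. The sketch below is purely illustrative: it assumes 100 hypothetical regions whose yields fluctuate independently by up to 30% around 100 units each year, which is a simplification (real regional yields are often correlated through weather). It compares the year-to-year relative variation of one region with that of the total.

```python
import random
import statistics


def relative_variation(values):
    """Coefficient of variation: standard deviation as a fraction of the mean."""
    return statistics.pstdev(values) / statistics.mean(values)


rng = random.Random(7)
num_regions, num_years = 100, 20

# yields[r][y]: hypothetical yield of region r in year y (made-up data).
yields = [[100 * rng.uniform(0.7, 1.3) for _ in range(num_years)]
          for _ in range(num_regions)]

one_region = yields[0]
national_total = [sum(region[y] for region in yields) for y in range(num_years)]

print("one region :", round(relative_variation(one_region), 3))
print("all regions:", round(relative_variation(national_total), 3))
```

Because the regional fluctuations tend to offset one another, the relative variation of the total comes out far smaller than that of any single region.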
The Law of Inertia of Large Numbers has very great utility in
Statistics as well as in day-to-day life. It is this constancy of great
numbers that makes measurement possible. Both the Law of
Statistical Regularity and the Law of Inertia of Large Numbers
are based upon experience. They may be exemplified by the
insurance business. Insurance companies collect large
masses of data from past experience, on the basis of which they
charge premiums. According to W.A. Neiswanger, "while
prediction of behavior of individuals is unrewarding, the typical
behavior of the mass of individuals is clearly revealed and is
reasonably stable from community to community."
Activity 7.3
How is the Law of Inertia of Large Numbers of use to insurance
companies in formulating their policies for premiums?






THEORY OF SAMPLING
The theory of sampling emanates from the study of
relationships existing in a population, compared to those in a
sample drawn from that particular population. In a sense, this
theory is developed with the purpose of estimating the properties
of a population from those obtained by studying a specific
sample. It also deals with finding out the precision of the
estimate. It is important to have a random sample for this theory
to be valid. The methodology of drawing inferences about the
universe from random samples is known as the theory of sampling.
Sampling theory deals with:
a) Statistical estimation: Sampling theory helps in
estimating unknown population parameters from a
knowledge of statistical measures based on sample
studies. The estimate can be a point estimate or
an interval estimate. A point estimate is a single
figure, whereas an interval
estimate has two limits, viz. the upper limit and the lower
limit, within which the parameter value may lie.
b) Testing of hypothesis: The second objective of
sampling theory is to accept or reject a hypothesis. Sampling
theory helps in determining whether observed
differences are due to chance or whether they
are of any real significance.
c) Statistical Inferences: Sampling theory uses statistical
underpinnings to make a generalized statement about a
specific population, extended from the studies conducted
on a sample obtained from it. This is what is known as
statistical inference about a particular population and
the accuracy thereof.
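The difference between a point estimate and an interval estimate in (a) can be sketched as follows. The expenditure figures are invented for illustration, the function name is hypothetical, and the interval uses the large-sample normal multiplier z = 1.96 (about 95% confidence); for a sample this small a t-multiplier would normally be preferred, so the limits are indicative only.

```python
import statistics


def point_and_interval_estimate(sample, z=1.96):
    """Return the sample mean and (lower, upper) limits around it."""
    n = len(sample)
    point = statistics.mean(sample)                      # point estimate: one figure
    margin = z * statistics.stdev(sample) / n ** 0.5     # half-width of the interval
    return point, (point - margin, point + margin)


# Hypothetical monthly expenditure (Rs.) of ten sampled students.
expenditure = [4200, 5100, 4800, 5300, 4700, 5000, 4900, 5200, 4600, 5200]

point, (low, high) = point_and_interval_estimate(expenditure)
print("point estimate   :", point)
print("interval estimate:", round(low), "to", round(high))
```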
SCOPE OF CENSUS METHOD OF DATA
COLLECTION
When all the items of the population under investigation are studied
and conclusions are then drawn, the method is called a census. In the
technique of sample investigation, on the other hand, certain units
from the whole domain of survey are selected as being representative.
These are studied in detail and the conclusions arrived at from
them are extended to the entire field or domain. Unlike census
investigation, not all the units are studied in sample
investigation but only some of these are selected for study on a
certain definite basis. For example, if we want to study the
monthly expenditure of the students of a university, we may not
study all the students. We may collect figures of about 5% of
them only. Supposing there are 10,000 students, then we may
collect expenditure figure of only 500 students and extend our
conclusion to all of them. If full precaution is taken in selection
of representative students and data are collected faithfully, the
applicability of these conclusions to the entire set will be of very
high reliability.
SUITABILITY OF CENSUS METHOD
The census method of data collection is suitable under following
conditions:
i) The area of enquiry is limited, and not very vast.
ii) There is enough time for collecting data.
iii) Higher degree of accuracy is required.
iv) There is enough finance available to meet the expenditure
on the collection of data.
v) The units are heterogeneous.
vi) It is the only applicable method in some cases, for example, a
census of the population of a country.
Activity 7.4
Can census method be applied for following situations of Data
Collection: (a) Mobile numbers of all students in a class size of
100 (b) Number of Lions in Gir Forest (c) Number of Hilsa Fishes
in river Ganges.




SCOPE OF SAMPLING METHOD OF DATA
COLLECTION
Scope/Area of application of Sampling: Over the years,
sampling technique has emerged as a great tool for all
researchers. It is of particular use in quantitative research, in
the domains of:
Education
Economics
Commerce
Science
Sampling technique is of great importance even in routine
day-to-day activities. For example, if we are thinking of buying
potatoes from a vegetable vendor, we just pick one or two
potatoes and examine them to judge the
quality of the potatoes being sold. Another example is a whole
tanker of gasoline which is accepted or rejected based on
sample testing of a few milliliters of gasoline. Similarly,
doctors make inferences about a patient's health through a blood
test, from a single drop of blood. Therefore, sampling is a good
tool to learn about a large population by observing a few units. In
fact, the census technique is rarely used; its best-known use is
the population count of a country.
FEATURES OF SAMPLING METHOD
The sampling technique has following good features and these
bring into relief its value and significance.
1. Economy: Sampling technique brings about cost control
of a research project, as it requires much less physical
resources as well as time than the census technique.
2. Reliability: In sampling technique, if due diligence is
exercised in the choice of sample units and if the research
topic is homogeneous, then the sample survey can have
almost the same reliability as that of a census survey.
3. Detailed Study: An intensive and detailed study of
sample units can be done, since their number is fairly
small. Also, multiple approaches can be applied to a
sample for an intensive analysis.
4. Scientific Base: As mentioned earlier, this technique is
of a scientific nature, as the underlying theory is based on
principles of statistics.
5. Greater Suitability in most Situations: It has a wide
applicability in most situations, as the examination of few
sample units normally suffices.
LIMITATIONS OF SAMPLING
1. Less Accuracy: In comparison to census technique the
conclusions derived from sample are more liable to error.
Therefore, sampling technique is less accurate than the
census technique.
2. Changeability of Units: If the units in the field of survey
are liable to change or if they are not homogeneous, the
sampling technique will be very hazardous. It is not
scientific to extend the conclusions derived from one set
of samples to other sets which are dissimilar or are
changeable.
3. Misleading Conclusions: If due care is not taken in the
selection of samples or if they are arbitrarily selected, the
conclusions derived from them will become misleading if
extended to all units. For example, if in assessing the
monthly expenditure of university students we select
only rich students for the sample study, our results will be highly
erroneous if extended to all students.
4. Need for Specialized Knowledge: The sampling
technique can be successful only if a competent and able
scientist makes the selection. If this is done by an average
scientist, the selection is liable to be wrong.
5. When Sampling is not Possible: Under certain
circumstances it is very difficult to use the sampling
technique. If the time is very short and it is not possible
to make selection of the sample, the technique cannot be
used. Besides, if we need 100% accuracy the sampling
technique cannot be used. It can also not be used if the
material is of heterogeneous nature.
CHARACTERISTICS OF IDEAL SAMPLE
A good sample has following qualities:
1. Representativeness: An ideal sample must be such that
it adequately represents the whole population. We should
select those units which have the same set of qualities
and features as are found in the whole data. It should not
lack any characteristic of the population.
2. Independence: The second feature of a sample is
independence, that is, interchangeability of units. Every
unit should be available to be included in the sample.
3. Adequacy: The number of units included in a sample
should be sufficient to enable derivation of conclusions
applicable to the whole population. A sample having 10%
of the whole population is generally adequate.
4. Homogeneity: The units included in sample must bear
likeness with other units, otherwise sample will be
unscientific.
Activity 7.5
How would you judge the following sample based on the criteria of an
Ideal Sample: 20 students from a class of 100?






METHODS OF SAMPLING
Methods of sampling can be grouped under two broad
categories:
i) Probability sampling methods or random sampling
methods.
ii) Nonprobability sampling methods or nonrandom
sampling methods.
PROBABILITY SAMPLING METHODS
Probability sampling methods are those in which every item in
the universe has a known chance or probability of being chosen
for the sample. However, a known chance does not mean an equal
chance.
Advantages of Probability Sampling: The following are the
advantages of this type of sampling methodology.
i) Probability sampling does not require detailed
information about the universe, to be effective.
ii) Probability sampling provides estimates which can
be measured precisely and are inherently unbiased.
iii) It is possible to evaluate the relative efficiency of
various sample designs when probability sampling is
used.
Limitations of Probability Sampling: The limitations of this
method are:
a) It requires a high degree of skill level and expertise.
b) It requires a lot of time to plan and determine a probability
sample.
c) The costs involved in probability sampling are higher as
compared to nonprobability sampling.
Types of Probability Sample: These are:
i) Simple Random Sampling (SRS)
ii) Stratified Sampling
iii) Systematic Sampling
iv) Multistage Sampling
SIMPLE RANDOM SAMPLING
A sampling procedure is known as simple random sampling when
the individual units constituting the sample are selected at
random. The random method of selection assures each
individual element or unit in the universe an equal chance of
being chosen. In other words, if, for a sample of size n, all the
possible combinations of n elementary items have the same
probability of being selected, the procedure is called simple random
sampling.
Simple random sampling is of two types:
a) Sampling with equal probability
i) with replacement, and
ii) without replacement.
b) Sampling with probability proportional to size of
the sample unit.
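These variants can be sketched with Python's standard `random` module. The universe of 500 numbered units and the size measures are made up for illustration; `random.sample` draws without replacement, while `random.choices` draws with replacement and accepts weights, which gives a simple (with-replacement) form of selection with probability proportional to size.

```python
import random

rng = random.Random(2024)
universe = list(range(1, 501))            # hypothetical units numbered 1..500

# Equal probability, without replacement: no unit can be drawn twice.
without_replacement = rng.sample(universe, k=50)

# Equal probability, with replacement: the same unit may be drawn again.
with_replacement = rng.choices(universe, k=50)

# Probability proportional to size (drawn with replacement); sizes are invented.
sizes = [rng.randint(1, 100) for _ in universe]
proportional_to_size = rng.choices(universe, weights=sizes, k=50)

print(len(set(without_replacement)), len(set(with_replacement)))
```

The first count printed is always 50; the second may be smaller whenever a unit has been drawn more than once.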
Selecting a Random Sample: A random sample can
generally be selected by the following four methods:
i) Lottery method
ii) Tippett's numbers method
iii) Selection from sequential list
iv) Grid system
i) Lottery Method: In this method, a lottery is
drawn by writing the numbers or the names of the various
units on slips and putting them in a container. They are
thoroughly mixed and a certain number are picked
from the container; those which are picked are
taken up for sampling.
ii) Tippett's Numbers Method: It is called Tippett's
numbers method because it was evolved by L.H.C.
Tippett, who constructed a list of 10,400 four-digit
numbers written at random. From these numbers it is
not very difficult to draw samples at random. For
example, if 50 persons are to be selected for study out
of a total of 500, then we can open any
page of Tippett's numbers and select the first 50 random
numbers that do not exceed 500 and take
the corresponding persons up for study. On the basis of the experiments
carried out through this technique, it has been found
that the results drawn on the basis of this
method of random sampling are quite reliable.
iii) Selection from Sequential List: In this method,
the names are arranged serially according to a
particular order. The order may be alphabetical,
geographical or only serial. Then out of the list any
number may be taken up. The selection may
begin from anywhere. For example, if we want to
select 10 persons, we can start from the 10th name and
select the 10th, 20th, 30th, 40th and so on.
iv) Grid System: This method is generally used for
selecting a sample of an area. In this
method, a map of the entire area is drawn. After that
a screen with squares is placed upon the map and
some of the squares are selected at random. The areas falling
within the selected squares are taken as samples. This
is today possible by electronic means.
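As a rough illustration of doing the grid system "by electronic means", the sketch below lays a hypothetical 10 x 10 grid over a map and picks some squares at random. The grid dimensions, the number of squares and the function name `select_grid_squares` are assumptions made for the example only.

```python
import random


def select_grid_squares(rows, cols, num_squares, seed=None):
    """Pick grid squares at random, identified by (row, column) position."""
    rng = random.Random(seed)
    all_squares = [(r, c) for r in range(rows) for c in range(cols)]
    return sorted(rng.sample(all_squares, num_squares))   # no square chosen twice


# A 10 x 10 screen placed over the map; 8 squares are taken as the sample.
squares = select_grid_squares(rows=10, cols=10, num_squares=8, seed=3)
print(squares)
```

The area of the map falling within each selected square would then be surveyed.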
Activity 7.6
You are trying to estimate the forest cover in the Delhi NCR
region. Which type of random sampling method would you
choose and why?





Precautions in Random Sampling: The following precautions
should be taken in random sampling:
a) The universe or the population to be sampled
should be clearly defined.
b) List of all the units that are available for the
purpose of selection, should be prepared.
c) The units available for selection should be
approximately of equal
size.
d) These units should be independent and not
dependent upon one another. It means that if one unit is
selected, it should not be necessary to select another unit
for the sake of complete information.
e) Every unit should be accessible.
f) Replacement of the selected unit should not be
done.
Advantages of Random Sampling Method: The method of
random sampling is simple and is said to be a very easy
method to employ for the study of business research
problems. That is why Ackoff has remarked:
"Simple random sampling, in a sense, is the basic form of all
scientific sampling."
In this method, the investigator can keep himself away from
prejudices, bias and other elements of subjectivity. It has the
following advantages:
a) It is quite simple, and follows a mathematical
procedure.
b) It is free from bias and prejudices.
c) It is said to be more representative because in
this method, each unit has an equal chance of being
selected.
d) In this method, if some error has crept in, it will
not be difficult to detect, provided the sampling has been
done strictly at random.
Disadvantages of Random Sampling Method: Although the
random sampling method has some advantages, it suffers from
certain demerits as well. These demerits are:
a) Selection on a strictly random basis
is not always possible: Sometimes, instead of random sampling,
we resort to substitution of a selected unit, which vitiates
the whole procedure.
b) Lack of control by the researcher: In this method,
the researcher has no control over the selection of the
units. Therefore, the units selected may be widely
dispersed and contacting them may become a
problem.
c) Random sampling does not suit
heterogeneous groups: Random sampling is a useful
method if all the units are homogeneous; in case the units
are heterogeneous in nature, the random sampling
method may not be very useful.
Is Random Sampling Superior to Other Methods?
According to C.P. Babu, "Random Sampling is no cure for all
evils." The only statement that can be made is that through
probability sampling methods the chances of getting a
representative sample are quite high. It is possible to have a
representative sample resulting from a nonprobability
sampling selection and a non-representative sample resulting
from a probability sampling selection. However,
these can at best be exceptions. Such exceptions should not become a
basis for the use of nonprobability sampling procedures. Any
business research study should employ a scientific sampling
procedure, that is, a random sampling method.
STRATIFIED SAMPLING METHOD
In this method, the universe or the entire population is divided
into a number of groups or strata in such a way that the
elements within each stratum are homogeneous, whereas between
strata there is heterogeneity. Further, the strata should be mutually
exclusive and collectively exhaustive. Since the method deals
with strata, it is known as Stratified Sampling. In this
technique, the whole universe is divided into various groups and a
certain number of items are taken from each group at random.
The strata themselves are formed with a definite purpose or
deliberate intention, but within each stratum the selection of units is
done at random. Thus a stratified sample is equivalent to a set
of random samples on a number of sub-populations.
Importance of Strata: The method depends very much upon
the process of stratification. If stratification is done correctly,
the method can be employed successfully. Stratification should
be done with the following kept in mind:
a) Each stratum in the universe should be large
enough in size so that selection of items may be done on a
random basis.
b) There should be perfect homogeneity among
the units within a stratum.
c) The proportion of items selected from
each stratum should be the same as the proportion that
the units in that stratum bear to the
entire universe.
d) Stratification should be well-defined and clear-cut.
It means that each stratum should be free from the
influence of the others.
Activity 7.7
If you are studying luxury car brands (BMW, Mercedes, Audi
etc.), what kind of strata would you choose among the universe
of car owners?






TYPES OF STRATIFIED SAMPLING:
a) Proportionate stratified sampling: In this kind of
stratified sampling the number of units drawn
from each stratum is in proportion to the size of the stratum.
b) Disproportionate stratified sampling: In this type of
stratified sampling an equal number of cases is taken
from each stratum without any consideration of the size
of the stratum in proportion to the universe.
c) Stratified weight sampling: In this method, an equal
number of units is selected from each stratum and
averages are drawn, but in doing so they are given
weights in proportion to the size of the stratum in relation
to the universe.
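The arithmetic behind proportionate allocation and stratum weighting can be sketched as below. The strata (a university's students grouped by level of study), their sizes and their mean expenditures are hypothetical, as are the function names; rounding may make the allocated numbers add up to slightly more or less than the intended sample size in other cases.

```python
def proportionate_allocation(stratum_sizes, sample_size):
    """Number of units to draw from each stratum, in proportion to its size."""
    total = sum(stratum_sizes.values())
    return {name: round(sample_size * size / total)
            for name, size in stratum_sizes.items()}


def weighted_mean(stratum_sizes, stratum_means):
    """Combine stratum averages, weighting each by the size of its stratum."""
    total = sum(stratum_sizes.values())
    return sum(stratum_sizes[s] * stratum_means[s] for s in stratum_sizes) / total


# Hypothetical strata of a university with 10,000 students.
sizes = {"undergraduate": 6000, "postgraduate": 3000, "doctoral": 1000}
print(proportionate_allocation(sizes, 500))
# {'undergraduate': 300, 'postgraduate': 150, 'doctoral': 50}

# Hypothetical average monthly expenditure (Rs.) found in each stratum.
means = {"undergraduate": 4000, "postgraduate": 5000, "doctoral": 7000}
print(weighted_mean(sizes, means))  # 4600.0
```

An unweighted average of the three stratum means would give about Rs. 5333, which overstates the figure because the small doctoral stratum would count as much as the large undergraduate one.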
Merits of the Stratified Sampling Method: Stratified
sampling method has certain merits. Ackoff has rightly said:
"Stratified sampling enables the researcher to make a
comparison of properties of the strata as well as to estimate
population characteristics." The merits of stratified sampling
are described below:
a) Greater control of the investigator: In this method,
the investigator has greater control over the selection of
the samples. In random sampling, although every group
has a chance of being selected in the sample, it can
so happen that certain
important groups are left unrepresented; in stratified
sampling no important group is likely to be left out.
b) Easy to achieve representative character: In this
method it is possible to achieve representative character
with fewer items. When a stratum is homogeneous, the
selection of a few units fulfils the representative character.
c) Replacement of units is possible: In random
sampling, if a particular unit is not accessible for study, it is
difficult to replace it by another, but in stratified sampling
replacement of an inaccessible case by an accessible case
is possible. Stephen has said: "By providing that a fixed
proportion of the sample shall come from each geographic
area or income class, stratification automatically brings
about a replacement of persons lost to the sample by
persons of the same stratum, thus partly correcting the
bias that would result if there were no replacement of losses."
Demerits of Stratified Sampling Method: The disadvantages
of stratified sampling method are described below:
a) Possibility of bias: In this method, if stratification has
not been done properly, there is every possibility of
bias creeping in.
b) Difficult to attain proportion: It is very difficult to
attain proportion through design. In random sampling
it is achieved automatically. Attainment of proportion
becomes particularly difficult when there is wide
variance in the size of different strata.
c) Difficulty in making the sample representative: In
the disproportionate type of stratified sampling, the
element of weighting introduces the factor of
selection. If under-weighting has been done, the
sample becomes unrepresentative.
d) Difficulty in placing cases under stratum: If the
strata are not very clear-cut, it may not be easy to
decide in which stratum a particular case is to
be placed. This upsets the applecart of this method.
SYSTEMATIC SAMPLING
Systematic sampling is a variation of simple random sampling.
It requires that the universe, or a list of its units, be ordered in
such a way that each element of the universe can be uniquely
identified by its order. A voter list, a telephone directory or a
card index system would all generally satisfy this condition.
Suppose there are 5000 cards (and hence 5000 units of the
universe), and we want a sample of 500. We can select a
number between (and including) 1 and 10 at random, say 8.
Then we select the units whose cards are in the following
positions: 8, 18, 28, 38, ..., 928, ..., 1008, ..., 4998. This would be a
systematic random sample, commonly known as a systematic
sample.
SAMPLING INTERVAL/RATIO
Using the natural ordering of the universe, select a random starting
point between 1 and the nearest integer to the sampling ratio, then
select items at intervals of the nearest integer to the sampling ratio.
Mathematically,
K = N / n
where K = Sampling ratio
N = Universe size
n = Sample size
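The procedure can be sketched in a few lines, using the card example (N = 5000, n = 500, random start 8). The function name `systematic_sample` is hypothetical; note that when N is not an exact multiple of n, the number of units actually selected can differ slightly from n.

```python
import random


def systematic_sample(universe_size, sample_size, start=None, seed=None):
    """Return the positions selected by systematic sampling (1-based)."""
    k = round(universe_size / sample_size)           # sampling ratio K = N / n
    if start is None:
        start = random.Random(seed).randint(1, k)    # random start between 1 and K
    return list(range(start, universe_size + 1, k))  # every K-th unit from the start


positions = systematic_sample(5000, 500, start=8)
print(positions[:4], positions[-1], len(positions))  # [8, 18, 28, 38] 4998 500
```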
ADVANTAGE OF SYSTEMATIC SAMPLING
The main advantages of systematic sampling are the following:
i) Simplicity in drawing a sample, which is easy to check.
ii) Except for populations with periodic behaviour, systematic
sampling variances are often somewhat smaller than
those for alternative procedures.
iii) If population is ordered with respect to a relevant
property, giving a stratification effect, this helps in
reducing variability compared to simple random sampling.
The following are the main disadvantages of systematic
sampling:
i) If sampling interval is related to a periodic ordering of the
universe, increased variability may be introduced.
ii) Estimates of error are likely to be high where there is a
stratification effect.
In practice, systematic sampling should be used only
when one is sufficiently acquainted with the data to be able
to demonstrate that periodicities do not exist, or that the sampling
interval is not a multiple or submultiple of the period.
MULTISTAGE SAMPLING
This procedure is generally used in selecting a sample from a
very large population (universe). As the name suggests, in
multistage sampling the selection of the sample is made in different
stages. The original units into which the universe is divided are
primary units. Each primary unit that falls into the sample is
subdivided into secondary units in preparation for the second
stage of sampling. In 3-stage sampling, there will be primary,
secondary and tertiary units. Sometimes even four stages are
used.
BRIEF DESCRIPTION OF STAGING PROCESS
Use a form of random sampling in each of the sampling stages
where there are more than two stages.
For example, suppose we want to take a sample from the population of
teachers of Punjab University. The University provides
us with a list of names. The list consists of, say, 400 pages with
approximately 20 names per page. These pages are numbered
and constitute the primary sampling units. The names are not
numbered but are arranged alphabetically; they constitute the
secondary sampling units. Let us suppose we want a sample of
100 teachers. The sample may be selected as follows. We may
decide to select 5 teachers from each of 20 pages. Since 400/20 = 20,
every 20th page is taken: select a number
from 1 to 20 at random, say 8, and select pages 8, 28, 48 and
so on up to page 388. Then, by the use of random numbers, select 5
names from each of the 20 pages. This is a combination of systematic
and simple random sampling. Theoretically, we can use the
reverse procedure also.
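The two stages of this example can be sketched as follows, assuming exactly 20 names on every page (the text says "approximately"). The function name `two_stage_sample` and its parameters are hypothetical; each selected teacher is identified by a (page, position on page) pair.

```python
import random


def two_stage_sample(num_pages=400, names_per_page=20,
                     pages_wanted=20, names_wanted=5, seed=None):
    """Stage 1: systematic selection of pages. Stage 2: random names per page."""
    rng = random.Random(seed)
    interval = num_pages // pages_wanted                 # 400 / 20 = every 20th page
    first_page = rng.randint(1, interval)                # random start, 1..20
    pages = range(first_page, num_pages + 1, interval)   # primary sampling units
    sample = []
    for page in pages:
        # Secondary sampling units: 5 name positions drawn at random on the page.
        positions = rng.sample(range(1, names_per_page + 1), names_wanted)
        sample.extend((page, pos) for pos in sorted(positions))
    return sample


teachers = two_stage_sample(seed=11)
print(len(teachers), "teachers from", len({page for page, _ in teachers}), "pages")
```

Only the 20 selected pages need to have their names numbered, which is the practical saving the method offers.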
The variability of estimates yielded by multistage sampling may
be greater than that of estimates yielded by simple random
sampling for equal size. The variability of estimates in
multistage sampling depends on the composition of primary
units. There are reservations in recommending this method:
a) Control of nonsampling errors would be difficult and
costly, and
b) A probability sample of small units drawn at one stage
requires a frame which lists all the small units, and such a
plan becomes costly beyond reason.
MERITS OF MULTISTAGE SAMPLING:
i) A complete listing of the universe is not required.
Sampling lists, identification and numbering are required
only for sampling units selected in sample.
ii) If sampling units are geographically defined, it cuts
down field costs such as travel.
DEMERITS OF MULTISTAGE SAMPLING
i) Errors are likely to be larger than in simple random
sampling or systematic sampling for similar sample size.
ii) Errors increase as number of selected sampling units
decreases.
Activity 7.8
In a satisfaction survey in Gurgaon regarding the working of the
municipal corporation, how would you go about using multistage
sampling to select a sample of reasonable size?






NONPROBABILITY SAMPLING METHODS
Nonprobability sampling methods are those which do not
provide every item in the universe with an equal chance of
being included in the sample. The selection process is partially
subjective. Nonrandom sampling is the process of sample selection
without the use of randomization. In other words, a nonrandom
sample is selected on a basis other than probability
considerations, such as convenience or judgment.
Nonprobability sampling methods are:
i) Judgment or purposive or deliberate sampling
ii) Convenience sampling
iii) Quota sampling.
JUDGMENT SAMPLING
In this method of sampling the choice of sample items depends
primarily on the judgment of the researcher. In other words, the
researcher determines and includes those items in the sample
which he thinks are most typical of the universe with regard to
the characteristics of the research project. For example, if a
sample of 100 teachers is to be selected from a university having
500 teachers for analyzing the spending habits of teachers, the
researcher would select 100 teachers who, in his judgment, are
representative of the university.
Merits: This method is sometimes used in solving many types
of economic and business problems. The use of judgment
sampling is justified by the following premises:
i) If there is a small number of sampling units in the
universe, judgment sampling enables inclusion of
important units.
ii) Judgment stratification of the population helps in
obtaining a more representative sample when the research
study wants to look into unknown traits of the population.
iii) Judgment sampling is a practical method to arrive at
some solution to everyday business problems.
Limitations:
i) The judgment sampling involves the risk that the
researcher may establish conclusions by including
those items in the sample which conform to his
preconceived ideas.
ii) There is no objective way of evaluating the
reliability of sample results.
CONVENIENCE SAMPLING
Convenience sampling is commonly known as unsystematic,
accidental or opportunistic sampling. Under this procedure a
sample is selected according to the convenience of the
investigator. Convenience sampling may be used in the
following cases:
i) When the universe is not well defined,
ii) When the sampling unit is not clear, and
iii) When a complete source list is not available.
QUOTA SAMPLING
Quota sampling is a nonrandom form of stratified sampling. It
usually consists of three steps:
a) Classification of the population into various types, in
terms of properties known or assumed to be pertinent to
the characteristics being researched.
b) Determination of the proportion of the universe falling
into each type on the basis of the known or estimated
composition of the universe, and
c) Fixing of quotas for each interviewer or investigator who
has the responsibility of selecting respondents so that the
total sample interviewed contains the proportion of each
stratum.
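The three steps can be sketched in Python. This is a minimal illustration, not a prescribed procedure: the strata, their percentage shares, and the stream of respondents are invented, and the function names `stratum_quotas` and `fill_quotas` are hypothetical.

```python
def stratum_quotas(percentages, sample_size):
    """Step (b)/(c): turn stratum percentages into whole-number quotas."""
    raw = {k: p * sample_size / 100 for k, p in percentages.items()}
    quotas = {k: int(v) for k, v in raw.items()}
    shortfall = sample_size - sum(quotas.values())
    # Give any leftover units to the strata with the largest fractional parts.
    by_fraction = sorted(raw, key=lambda k: raw[k] - quotas[k], reverse=True)
    for k in by_fraction[:shortfall]:
        quotas[k] += 1
    return quotas

def fill_quotas(respondents, quotas):
    """Accept respondents in the order met until each quota is full.

    No randomization is involved, which is what makes this quota sampling
    rather than stratified random sampling.
    """
    remaining = dict(quotas)
    selected = []
    for name, stratum in respondents:
        if remaining.get(stratum, 0) > 0:
            selected.append((name, stratum))
            remaining[stratum] -= 1
    return selected

# Step (a): assumed classification of the population, with assumed shares (%).
shares = {"urban": 55, "semi-urban": 25, "rural": 20}
quotas = stratum_quotas(shares, sample_size=40)

# Hypothetical stream of people an interviewer happens to meet.
strata = ["urban", "semi-urban", "rural"]
stream = [(f"respondent_{i}", strata[i % 3]) for i in range(120)]
selected = fill_quotas(stream, quotas)
print(quotas, len(selected))
```

The sample reproduces the stratum proportions, but who is chosen within each stratum is left to the interviewer.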
MERITS OF QUOTA SAMPLING
i) Reduces cost of preparing sample and field work, since
ultimate units can be selected so that they are close
together.
ii) Introduces some stratification effect.
DEMERITS OF QUOTA SAMPLING
i) Introduces the bias of the investigator in the selection of
respondents.
ii) Since random sampling is not involved at any stage, the
errors of the method cannot be estimated by statistical
procedures.
Quota sampling is most commonly used in marketing surveys
and election polls.
Activity 7.10
List two situations each where you would use (a) Convenience
Sampling, (b) Judgment Sampling and (c) Quota Sampling.




RELIABILITY OF THE SAMPLING
As is evident from our discussion so far, a sample should be
reliable and free from bias. Only then is it possible for us
to arrive at dependable results. The size of the sample, its
relevance and suitability to the problem, its representative
character, its use for the study, etc. are the factors that
determine its reliability. Reliability of samples may
be tested on the following parameters.
1. Size of the sample: The size of the sample very much
determines not only its representativeness but also its utility
for study. The researcher must test that the size is adequate
for scientific yet convenient study of the problem.
2. Representativeness of the sample: The
representativeness of the sample should also be tested. It
means that the sample selected should be representative
and possess the characteristics of other units.
3. Parallel sampling: It means that apart from the sample
that has been drawn, another sample may be drawn from
the same universe for testing. By comparing the two, the
reliability of the sample initially selected may be tested.
4. Homogeneity of the samples: Samples, in order to be
useful for study, should be homogeneous. This again means
that they should possess all the characteristics that are
present in the universe.
5. Unbiased selection: The selection of the sample should be
done through a method which is free from bias and
prejudice. For this, the researcher has to take the
necessary precautions.
SIZE OF SAMPLE
For proper study of the problem, it is necessary to have proper
sampling. It means that the sample should be of proper size. If
the sample is either too small or too big, it makes the study
difficult. What the size of the sample should be is a question
which can be answered only after taking into account the
various factors of the research problem at hand. In this regard
Parten has laid down that:
An optimum sample in a survey is one which fulfils the
requirements of efficiency, representativeness, reliability and
flexibility. The sample should be small enough to avoid
unnecessary expenses and large enough to avoid intolerable
sampling error.
FACTORS TO BE CONSIDERED IN SAMPLE SIZE
The following factors should be considered while deciding the
sample size:
i) The size of the universe: The larger the size of the
universe, the bigger should be the sample size.
ii) The resources available: If the resources available
are vast, a large sample size could be taken.
However, in most cases resources constitute a big
constraint on sample size.
iii) The degree of accuracy or precision desired: The
greater the degree of accuracy desired the larger
should be the sample size. However, it does not
necessarily mean that bigger samples always
ensure greater accuracy.
iv) Homogeneity or heterogeneity of the Universe: If
the universe consists of homogeneous units, a
small sample may serve the purpose but if the
universe consists of heterogeneous units, a large
sample may be required.
v) Nature of study: For an intensive and continuous
study a small sample may be suitable. But for
studies which are not likely to be repeated and are
quite extensive in nature, it may be necessary to
take larger sample size.
vi) Method of sampling adopted: The size of samples is
also influenced by the type of sampling plan
adopted. For example, if the sample is a simple
random sample it may necessitate a bigger sample
size. However, in a properly drawn stratified
sampling plan, even a small sample may give
better results.
vii) Nature of respondents: Where it is expected a large
number of respondents will not cooperate and
send back the questionnaires, a larger sample
should be selected.
The above factors have to be properly weighed before
arriving at the sample size. However, the selection of
the optimum sample size is not as simple as it might seem
to be.
DETERMINATION OF SAMPLE SIZE
An important question that arises in any research
study using sampling is: what should be the size of the
sample? In determining the sample size, either one or
both of the following considerations are taken into
account:
1. Cost and time,
2. Permissible error in estimation.
Usually in many sampling surveys, cost and time are
considered in arriving at a suitable sample size. Statistical
methods can be employed in determining the sample size
for probability sampling designs, if the latter
consideration is taken into account. The basic principle
behind these methods can briefly be stated as follows:
Through sampling, we obtain an estimate of a
parameter of the population (or universe). It is rational
to expect the estimate to be as close as possible to the
value of the (unknown) parameter. This factor is to be
considered in determining the actual sample size.
But the basic question is how close the estimate should
be to the parameter. One ideally wishes that the estimate
should be exactly the same as the parameter. This is
physically impossible in almost all cases, since it involves
considering the whole population, i.e., the sample is the
population itself. Hence, in determining the sample size,
the researcher should fix the margin of error, for example
that the estimate should be within plus or minus 5 units of
the value of the parameter. But the value of the parameter is
not known. Hence, one cannot be sure whether a given sample
size results in the required amount of accuracy. However, one
can make a probability statement such as: there should be a
chance (probability) of 95% that the estimate is within plus
or minus 5 units of the parameter.
In some designs it is possible to incorporate the cost
aspect also.
In practice, a number of parameters are estimated in a
research study. In such cases, the sample size should be
determined with reference to a key parameter among them.
A number of formulae have been devised for determining
the sample size depending upon the availability of
information. A formula is given below:
\[ n = \left( \frac{Z\,\sigma}{d} \right)^{2} \]
Where n = Sample size
Z = Value at a specified level of confidence or
desired degree of precision.
\(\sigma\) = Standard deviation of the population
d = Difference between population mean and
sample mean.
The steps in computing the sample size from the above formula are:
i) Select the desired degree of precision, i.e., the
specified level of confidence, and designate it as
small z (at the 1% level of significance or 99%
confidence level the value of z is 2.58, and at the 5%
level of significance or 95% confidence level it is 1.96).
ii) Multiply the z selected in step i) by the standard
deviation of the universe, which may be assumed.
iii) Divide the product of the preceding step by the
difference between the population mean and the sample
mean. Square the resultant quotient. The result is the
size of sample required.
Solved Example
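Determine the sample size if the standard deviation of the
population is 8, population mean = 30, sample mean = 28 and the
confidence level is 99%.
Solution
\[ n = \left( \frac{z\,\sigma}{d} \right)^{2} \]
\(\sigma\) = 8, d = (30 - 28) = 2
z = 2.576 (at the 1% level of significance the z value is 2.576)
Substituting the values:
\[ n = \left( \frac{2.576 \times 8}{2} \right)^{2} = (10.304)^{2} \approx 106.17 \]
n = 106 (to the nearest whole number)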
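The three steps described above (multiply z by the standard deviation, divide by the permissible difference d, and square the quotient) can be checked with a short Python sketch. The function name `sample_size` is hypothetical, and the inputs are those of the solved example.

```python
import math

def sample_size(z, sigma, d):
    """Return (z * sigma / d) squared, before any rounding."""
    if d <= 0:
        raise ValueError("the permissible difference d must be positive")
    return (z * sigma / d) ** 2

# Values from the solved example: 99% confidence, sigma = 8, d = 30 - 28 = 2.
n = sample_size(z=2.576, sigma=8, d=2)
print(round(n, 2))    # 106.17
print(round(n))       # 106, the figure quoted in the solved example
print(math.ceil(n))   # 107, if one rounds up to be sure of the precision
```

Rounding to the nearest whole number gives 106; many practitioners round up instead, which gives 107 here, so that the stated precision is not undershot.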
SAMPLING AND NON SAMPLING ERRORS
To use surveys based on samples, it is essential that one
appreciates the concepts of sampling and nonsampling errors.
Sampling errors arise out of the fact that inferences about the
entire population are drawn on the basis of a few sample
observations. Nonsampling errors, on the other hand, happen
due to errors of computation at the stage of classification and
processing of data.
SAMPLING ERRORS
Sampling errors are generally of two types, viz. biased and
unbiased.
1. Biased Errors: The process of selection and estimation
of samples may have some bias which leads to these
errors. For instance, if judgment sampling is used in a
research survey instead of simple random sampling,
some bias is introduced in the result due to the judgment
of the researcher in selecting the sample. Such
errors are biased sampling errors.
2. Unbiased Errors: These errors are caused by chance
differences between the population units selected in
the sample and those not selected. The error in the final
result arises from the differences between these units.
Causes of Bias: Bias may arise due to following reasons:
a) Faulty selection of sample
b) Substitution
All of the above factors can lead to bias in the representative
nature of the sample.
Bias can be avoided naturally by eliminating the above sources
of bias. The easiest and definite way of avoiding bias in the
sampling selection process is by selecting the sample absolutely
randomly.
Generally sampling errors are reduced by increasing the sample
size.
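How quickly the sampling error falls can be illustrated with the standard error of a sample mean. The sketch below assumes simple random sampling with independent draws and a known population standard deviation, and uses the textbook formula sigma divided by the square root of n; the value sigma = 8 is illustrative only.

```python
import math

def standard_error(sigma, n):
    """Standard error of the sample mean for n independent draws."""
    return sigma / math.sqrt(n)

# Quadrupling the sample size halves the standard error.
for n in (25, 100, 400):
    print(n, standard_error(8, n))
# 25 1.6
# 100 0.8
# 400 0.4
```

The reduction is real but subject to diminishing returns: each halving of the error requires four times as many observations.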
NONSAMPLING ERRORS
Nonsampling errors typically arise due to following factors:
a) Inadequate and inconsistent data specification.
b) Faulty method of interviews and observations.
c) Lack of experience and training of investigators.
d) Errors in data processing operations.
e) Errors in tabulation and classification of data.
The above list is not comprehensive but does give indication of
possible sources of errors.
Nonsampling errors can generally be reduced by controlling the
factors listed above. However, it must be noted that the two
kinds of error move in opposite directions: though sampling
errors decrease as the sample size increases, nonsampling errors
increase with the sample size. Therefore, the size of the sample
should be chosen so as to minimize the sum of sampling and
nonsampling errors.
SUMMARY
In this chapter definition of sampling, universe, sample,
sampling frame, sampling design, precision, confidence level,
and sampling distribution have been elaborated.
The differences between population and sample have been
highlighted. The laws of sampling, i.e., law of statistical
regularity and law of inertia of large numbers have been
explained. Theory of sampling and its elements have been
described. Differences between sample and census method of
data collection have been incorporated.
The scope, importance and limitations of sampling methods
have been included. The essentials of a sample have been
highlighted. The probability sampling methods, including random
sampling, stratified sampling, systematic sampling and cluster
sampling, and nonprobability methods such as judgment,
convenience and quota sampling have been discussed.
The representative character of the sample, its reliability, size
and units have been explained. Principles to be considered in
sample size determination have been described. Sampling errors
and their types have been elaborated.
REVIEW QUESTIONS
1. Define the term sampling and explain its main elements.
2. What is a universe? How is it distinguished from a population?
3. Explain the types of universe.
4. Explain the following terms:
a) Sample
b) Sampling frame
c) Sampling design
d) Precision
e) Confidence level
f) Significance level
g) Sampling distribution
5. Distinguish between statistics and parameters.
6. Differentiate between population and sample.
7. Explain the following laws
i) Law of statistical regularity
ii) Law of inertia of large numbers
8. Explain the concept of theory of sampling.
9. Differentiate between census and sample method of data
collection.
10. What are the limitations of census methods?
11. What are the limitations of sampling methods?
12. Explain the main essentials of a sample. What factors are to
be considered in selecting the size of a sample?
13. Explain the difference between probability and
nonprobability methods of sampling.
14. Distinguish between:
a) Simple random and stratified sampling methods
b) Systematic and cluster sampling
c) Judgment and Quota sampling
d) Random sampling and Judgment sampling
15.Write short notes
a) Representative character of sample
b) Reliability of sampling
c) Size of sample
d) Units of sample.
FURTHER READINGS
1. Singh, D. & Chaudhary, F.S. (2006). Theory and Analysis
of Sample Survey Designs. New Age International (P)
Limited: New Delhi.
2. Sachdeva, J.K. (2008). Business Research
Methodology. Himalaya Publishing House: New Delhi.
3. Shajahan, S. (2009). Research Methods for Management.
Jaico Publishing House: Delhi, India.
CHAPTER 8 ATTITUDE MEASUREMENT AND
SCALES
Objectives
After reading this chapter learners would be able to:
• Understand the underlying concepts behind the study of
attitude measurement.
• Appreciate the importance and difficulties in attitude
measurement.
• Know about the concept and significance of scale.
• Develop and construct various scales needed in research
situations.
• Appreciate the sources of error while using scaling
techniques and how to minimize them.
Structure
• Attitude Defined
• Importance of the Study of Attitude
• Measurement of Attitude
• Difficulties in Measurement of Attitude
• Concept of Scale
• Significance of Scaling
• Problems in Construction of Scale
• Attitude Scales
• Sources of Error in Measurement of Attitudes Using Scaling
Techniques
• Summary
• Review Questions
• Further Readings
ATTITUDE DEFINED
Attitude may be defined as behaviour having a certain
observable set of properties, or a relative tendency, that is
preparatory to and indicative of more complete adjustment.
CHARACTERISTICS OF ATTITUDE
i) Attitude must be capable of being observed: Where
it is not possible to observe an attitude as such, there
must exist other observable facts which are pointers to the
attitude of a person. There are four aspects of human
behaviour:
a) Overt symbolic, which includes the acts of speaking,
writing and gesturing.
b) Overt nonsymbolic, which includes such directly
significant acts as driving a car or closing a door.
c) Covert symbolic, or what is commonly designated as
thought.
d) Covert nonsymbolic, or what is usually described as
feeling or emotion.
ii) Attitude reveals tendency: A person's likely reaction
under certain hypothetical circumstances is indicated. It is
preparatory to and indicative of more complete adjustment.
Thus attitude is not the actual behaviour; it is only a
tendency and may differ significantly from the actual
behaviour of a person. It is a matter of probability and
not necessarily of actuality. It merely signifies how a person
is most likely to behave, although in most cases the two are
by and large the same.
Activity 8.1
If a salesperson is yawning or busy talking on his mobile, which
aspect of human behaviour is he depicting to you?






IMPORTANCE OF THE STUDY OF ATTITUDE
The study of the attitudes of individuals is of great
importance in business research studies. The following are the
main advantages derived from the study of attitudes:
1. Knowledge of Attitude helps in business control:
Controlled and planned business development can be aimed at
only when the attitude of the persons concerned is
known. It may be pointed out here that most
business environment changes are voluntary in nature,
and no amount of legal (or statutory) measures can make
people behave in a particular way. The famous saying
'you can take the horse to the pond but you cannot make him
drink' holds true in this context.
The failure of the controlled industrial regime was because
of total disregard of the attitude of the people. Knowledge
of the response mechanism is essential for the
successful implementation of all economic development
plans. According to Kotler, marketers desire to know how
consumer behaviours are built up and what
environmental pressures, educational and social, need
to be brought to bear upon individuals and groups of
individuals to build up the desired response.
2. Study of Attitude helps in formulation of business
ideas: Its study is essential for the successful
classification of customers according to types. Since the
concept of an ideal customer type holds great significance
in all business research studies, the basis of grouping is
the attitude of a customer towards certain aspects of life,
e.g., conservative and radical. A customer may be classified
as conservative only on the basis of attitude regarding new
products. Other similar types are also based upon the
specific attitudes of people. This helps in the formulation
of new business ideas.
3. Knowledge of Attitude facilitates market surveys: For
the successful conduct of market or customer surveys, the
attitude of the customers replying must be kept in mind
while framing the questionnaire or schedule. Questions that
are likely to offend must be carefully avoided. The
investigator must know the weak points of the respondents:
when they are likely to give a false reply, under what
circumstances they are most likely to be uncooperative, and
how their cooperation can be achieved. All this necessitates
an exhaustive study and clear understanding of the attitude
of the people.
4. It helps in business forecasting: Knowledge of attitudes
facilitates predictability, as analysis may result in
generalizations regarding the attitude of people in a given
set of events, and their attitudes may be predicted when
the circumstances are known. This quality of predictability
is very useful in demand forecasting. In fact, it is the
basis of all business research studies. It must be said,
however, that attitudes are generally not stable. At one time
we may have a great liking for a person; at another time we
may not like to see his face. A man may be a socialist in his
youth and may turn out to be a capitalist in his old age.
Therefore, findings are to be looked at with this in
mind.
5. It helps in maintaining economic order: For peaceful
living in every sector of life we have to know the attitudes
of others: their likes and dislikes, emotional reactions and
behaviour patterns in different circumstances.
Indifference to these may result in a clash. The same
applies to business houses and customers. The lack of
understanding of each other's attitudes has resulted in much
consumer activism. Therefore, a sound understanding of each
other's attitudes is essential for economic growth, leading
to the economic development of the nation.
Activity 8.2
In an attitude survey of customers regarding the introduction of
new smart phones, how would a market survey company capture the
ease-of-use idea? Comment.






MEASUREMENT OF ATTITUDES
The nature and characteristics of attitudes reveal that they are
rather difficult to measure.
For instance, the emotions and feelings involved in attitudes
cannot be measured directly. Thus attitudes can be measured only
indirectly, by approximate guesswork about them.
Attitudes, being internal systems, do not admit of direct
measurement; but because an individual's external behaviour is
produced by inner tendencies, attitudes can be inferred
from external behaviour.
It is essential that, for measuring attitudes, the immediate
behaviour be controlled. The degree of accuracy which
can be obtained in the measurement of external behaviour
cannot be expected in the measurement of attitudes, which, in
addition, does not offer the same degree of reliability. The
accuracy of the measurement of attitude can be judged by its
capacity for prediction.
Activity 8.3
How can customer attitudes towards emotional or sensory
matters, say music or film, be measured? Discuss some of
your ideas.






DIFFICULTIES IN MEASUREMENT
OF ATTITUDES
The following difficulties are encountered in measurement of
attitudes:
1. Attitudes are intangible: The attitudes are intangible
and thus not subject to visual observation. As Lundberg
remarks, for our purpose it is enough to point out that
guessing what other people think and feel is daily pursuit
of almost everybody and that some are better guessers
than others. We are interested in discovering and
improving still further the technique of good guessing.
But intangibility is a relative term and changes with our
knowledge of the phenomena and thus what is intangible
and immeasurable today may become perfectly tangible
and may be measured precisely.
2. Attitudinal behaviour is complex: The attitude of a
person or a group is complex and is subject to multiple
influences, which are so numerous that it is very difficult
to account for all of them. This is the reason why we cannot
say with certainty how a person will react. At best we can
only say that, other things remaining unchanged, a person
or a group is most likely to behave in a particular way;
but in actual practice these other things hardly ever
remain unchanged, and as such the predictions about the
attitude have little practical validity. However, this
assertion has been challenged by Thurstone, who opines
that for the problem of measurement this statement is
analogous to the observation that an ordinary table is a
complex affair which cannot be wholly described by any
single numerical index. So is a man such a complexity
which cannot be wholly represented by a single index.
Nevertheless, we do not hesitate to say that we measure
the table, and we say without hesitation that we measure a
man when we take some anthropometric measurement of
him.
3. Absence of proper scale: Lack of a well accepted and
proper scale for measuring attitude is yet another problem.
In the physical sciences we have proper
instruments for the purpose; for example, the
thermometer gives us an exact measurement of
temperature, the barometer tells us exactly the pressure of
the air, and so on. In the case of business and economic
phenomena no such instruments have been devised
which can infallibly give a correct measurement of
consumer attitudes. We know that a certain person likes a
particular programme, but what the degree of his liking is
we can hardly say. The scaling techniques devised and
adopted for this purpose are yet to be perfected and
standardized to the level of validity of the
mechanical instruments.
4. Dynamic nature of attitudes: Attitudes are never
fixed, because of the multiplicity of factors influencing
them. The attitude of the same person may change
with changes in time and circumstances, making it
difficult to lay down definite rules governing attitudes.
Activity 8.4
Suppose you were a restaurant Manager and you were to
develop a scale for spiciness of food to be displayed against
each dish in your menu. How would you develop such a scale?






CONCEPT OF SCALE
The word scale or scaling is generally used for measuring
something. It is in fact a device through which we measure
various things. It is easy to apply scales in the field of physical
science for measurement of physical phenomena. For example,
for measuring fluctuations of atmospheric pressure, we use a
barometer. A thermometer is used for measurement of temperature.
Tapes, metre rules and other yardsticks marked in millimetres,
centimetres, metres, etc. are used for length. Different
measuring devices are thus available for measuring different
aspects of physical phenomena, but measurement of business
and economic phenomena is not an easy task. Many aspects of
industrial activity are not readily discernible. It is not
possible to measure emotions, attitudes, faiths, values etc.
through the devices that are used for measuring physical
data. In business activity, there are two types of variables or
factors that are responsible for change of environment. Some of
these factors or variables can be measured through scales of the
kind used for measuring physical phenomena: they are converted
into quantitative data and are then subjected to scaling or
measurement. The others cannot be handled even by this process.
For example, it is possible to measure the number of males and
females in a social stratum, the paying capacity of different
members of society, the age or income of certain members of the
society, area, space etc. On the other hand, it is not possible
to measure directly social status, standard of living, social
attitudes etc. In spite of this, scaling or measurement is very
important for business research. Through measurement greater
accuracy and precision are possible, and these are very much
needed in business research as well. That is why
scaling or measurement is quite important in the field of
business and economic research.
Scaling defined: In the field of business research,
measurement or scaling implies the conversion of
characteristics or qualitative data into quantitative data. After
this conversion the scaling is done. This has to be done because
qualitative data or measurements are mostly subjective and
differ from investigator to investigator. Unless the data are
converted into quantitative form, measurement is not possible.
Various kinds of statistical measurements are used for
converting qualitative data and measuring various aspects of
business and industrial activity.
Activity 8.5
Try to develop scales for the following:
Standard of Living




Social Attitude towards Alcoholism




Moral Attitude towards Permissiveness




SIGNIFICANCE OF SCALING
In business research studies, scales of measurement are very
important for making the studies exact and scientific.
Though it is difficult to develop scales of measurement in
business and economics, this does not make them less useful in
any way. Business and industrial activity is complex,
abstract and qualitative. While this makes measurement
difficult, it increases the utility and importance of measurement
scales. For a scientific study of business problems, the
importance of scales of measurement will be clear from the
following.
1. To Make Business Research Study Scientific: Pointing
out the utility of scales of measurement in socioeconomic
investigation, W.J. Goode and P.K. Hatt have said: All
sciences move in the direction of greater precision. This
takes many forms, but one fundamental form is
measuring gradation. Thus business phenomena can be
measured by means of socioeconomic scales. Such
scales of measurement unravel new sources of business
information.
2. Objective Measurement: Measurement techniques help
in maintaining objectivity of study. In business research,
as more and more scales of measurement are being
developed, objectivity of study is increasing. Even though
several factors hinder such objectivity, it
has been achieved to such an extent that useful prediction
has become possible.
Types of Scales: There are two types of scales.
a) Concerning behaviour and personality: These scales
include:
i) Attitude Scales
ii) Moral Scales
iii) Characteristic Scales
iv) Social participation Scales
v) Psychoneurotic Inventories
b) Concerning socioeconomic environment: These are
another type of scale, used to measure certain other
aspects of the socioeconomic environment. Some of these
scales are used to study socioeconomic status, communities,
housing conditions and institutional framework.
BASIS FOR SCALE CLASSIFICATION
The scaling procedures may be broadly classified on one or
more of the following bases:
a) Scaling based on Subject Orientation: A scale may be
designed to measure the attributes of the respondent through
a stimulus presented to him. In such a scale the respondent's
characteristic is measured depending on his orientation
to the stimulus.
b) Scaling based on Response Form: The scaling procedure
may be classified on the basis of response form as
categorical or comparative. Categorical scales are simply
rating scales. Such scales are used when a respondent rates
some object without direct reference to other objects. Under
comparative scales the respondent is asked to compare several
objects and is expected to indicate which object is
superior to the others. Therefore, this scale is also known as a
ranking scale.
c) Scaling based on Subjectivity: The scale data may be
based on subjective personal preferences, or the respondent
may simply make non-preference judgments. For
example, in the former case, the respondent is asked to choose
which food he prefers or which method he would like to be
implemented, whereas in the latter case, the respondent is simply
asked to judge which food is more delicious or which method will be
more economical.
d) Scaling based on properties: If a scale is designed
with its mathematical properties in view, one may classify the
scale as nominal, ordinal, interval or ratio. These scales
are discussed in detail below:
i) Nominal Scale: A system in which numerical values are
assigned to objects in order to label them is called a nominal scale.
One ubiquitous example of a nominal scale is the numbers on the
jerseys of football players, which help spectators to identify
them. Such a scale is a very convenient method of keeping track
of objects, events and people. However, such a scale
permits little statistical analysis. All one can do is
to use the mode as a measure of central tendency. There is no
generally used measure of dispersion for nominal scales. The chi
square test is the most common test of statistical significance
that can be utilized. For the measurement of correlation the
contingency coefficient can be calculated.
Advantages
i) It is a simple scaling technique.
ii) It describes differences between objects by assigning
them to categories.
iii) It is widely used in surveys when data are being
classified by major groupings.
Limitations
i) It is the least powerful level of measurement.
ii) It permits only limited statistical analysis.
iii) It has limited use in research.
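The statistics named above for nominal data can be sketched as follows. This is a minimal illustration, not a prescribed procedure: the brand labels and the 2 x 2 table are hypothetical, and the contingency coefficient is taken in Pearson's form, C = sqrt(chi2 / (chi2 + n)), which is an assumption since the text does not give a formula.

```python
from collections import Counter
from math import sqrt

def mode_of(labels):
    """Most frequent category: the only average that suits nominal data."""
    return Counter(labels).most_common(1)[0][0]

def chi_square(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

def contingency_coefficient(table):
    """Pearson's contingency coefficient (assumed form, see lead-in)."""
    n = sum(sum(row) for row in table)
    chi2 = chi_square(table)
    return sqrt(chi2 / (chi2 + n))

# Hypothetical data: brand labels, and region (rows) by brand bought (columns).
brands = ["A", "B", "A", "C", "A", "B"]
table = [[30, 10], [10, 30]]

modal_brand = mode_of(brands)
chi2 = chi_square(table)
coefficient = contingency_coefficient(table)
print(modal_brand, chi2, round(coefficient, 3))
```

Note that no mean or standard deviation is computed: the labels carry no order or distance, so only counts are used.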
ii) Ordinal Scale: The ordinal scale, as the name suggests,
attempts to place objects or events in a particular order. In this
scale, no attempt is made to have equal intervals on the
scale. One popular example of such a scale is the rank
ordering utilized in qualitative business research. In fact, an
ordinal scale can be depicted as a statement of greater than (>)
or less than (<) without the ability to show how much greater or
lesser. In the ordinal scale, an equality (=) statement is also a
possibility. So far as statistical measures are concerned, the
measure of central tendency used with the ordinal scale is the
median. Dispersion can be measured by using either the
percentile or quartile method. For measurement of statistical
significance, non-parametric methods are utilized.
The major limitation of ordinal scales is that they only permit the
ranking of objects from lowest to highest. It must be noted that
ordinal scales have no absolute values, and therefore the real
differences between adjacent ranks may not be equal.
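As a brief sketch of the measures mentioned for ordinal data, the median and quartiles of a set of ranks can be computed with Python's standard statistics module. The ranks below are hypothetical, and the quartile cut points depend on the interpolation convention the library uses by default.

```python
from statistics import median, quantiles

# Hypothetical ranks given by seven respondents (1 = lowest, 5 = highest).
ranks = [3, 1, 4, 2, 5, 2, 4]

middle_rank = median(ranks)            # central tendency for ordinal data
quartiles = quantiles(ranks, n=4)      # three cut points: Q1, Q2, Q3
interquartile_range = quartiles[2] - quartiles[0]  # dispersion by quartiles

print(middle_rank, quartiles, interquartile_range)
```

The mean is deliberately not used here, because the distance between rank 1 and rank 2 need not equal the distance between rank 4 and rank 5.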
iii) Interval Scale: In the development of an interval scale, the
intervals are adjusted in terms of some rule that has been
established as a basis for making the intervals equal. The
centigrade temperature scale is an example of an interval scale
and illustrates the possibilities it provides. It can
be said that an increase in temperature from 80°C to 90°C is the
same increase in temperature as an increase from 30°C to 40°C.
However, it cannot be said that a temperature of 80°C is twice
as hot as a temperature of 40°C, because the zero on the scale is
arbitrary, being fixed at the freezing point of water.
Interval scales provide more powerful measurement than
ordinal scales because the concept of equality of
interval is inherent in the interval scale.
The mean is the appropriate measure of central tendency, while
the standard deviation is the most widely used measure of
dispersion. Tests for statistical significance used in case of
interval scales are the t test and the F test. The correlation
technique utilized is the product moment correlation.
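The temperature argument can be checked numerically. The sketch below converts the same readings to Fahrenheit, another interval scale with a different arbitrary zero: equal intervals stay equal, but the "twice as hot" ratio does not survive the change of scale. The function name is illustrative only.

```python
def celsius_to_fahrenheit(celsius):
    """Convert a Celsius reading to Fahrenheit."""
    return celsius * 9 / 5 + 32

# Equal intervals remain equal on both scales.
rise_high = 90 - 80
rise_low = 40 - 30
rise_high_f = celsius_to_fahrenheit(90) - celsius_to_fahrenheit(80)
rise_low_f = celsius_to_fahrenheit(40) - celsius_to_fahrenheit(30)

# Ratios depend on where the arbitrary zero sits, so they are not meaningful.
ratio_c = 80 / 40
ratio_f = celsius_to_fahrenheit(80) / celsius_to_fahrenheit(40)

print(rise_high == rise_low, rise_high_f == rise_low_f)
print(ratio_c, round(ratio_f, 3))
```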
iv) Ratio Scale: Ratio scales have an absolute or true zero
of measurement. For example, the zero point on a measuring tape
indicates the complete absence of length or height.
In fact, a ratio scale represents the actual amount of a variable.
Scales evolved for measuring physical dimensions such as
weight, height, distance etc. are examples. A large number of
quantitative techniques can be applied to ratio
scales. In fact, all mathematical operations possible with real
numbers are possible with ratio scales. In case of ratio scales,
central tendency can also be measured by calculating the
geometric and harmonic means.
With ratio scales it is possible to make statements such as,
"Rakesh's athletic prowess is twice as good as that of Mohan."
Here the ratio involved facilitates a comparison which was
not possible with interval scales. Thus it can be concluded that,
starting from the nominal scale and progressing to the ratio scale, the
quantum of information increases progressively.
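A short sketch of what a true zero allows: ratios are unchanged when the unit changes, and the geometric and harmonic means become usable. The weights and speeds below are hypothetical; geometric_mean requires Python 3.8 or later.

```python
from statistics import geometric_mean, harmonic_mean

# Hypothetical weights of two parcels, in kilograms and then in grams.
weights_kg = [2.0, 8.0]
weights_g = [w * 1000 for w in weights_kg]

# With a true zero, "four times as heavy" holds in any unit.
ratio_kg = weights_kg[1] / weights_kg[0]
ratio_g = weights_g[1] / weights_g[0]

# Means that are meaningful only for ratio-scale data.
gm = geometric_mean(weights_kg)
hm = harmonic_mean([40, 60])   # e.g. hypothetical speeds over equal distances

print(ratio_kg, ratio_g, gm, hm)
```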
e) Scaling based on number of dimensions: Scales may be
unidimensional or multidimensional. Under a
unidimensional scale we measure only one attribute of the
respondent or object, whereas under a multidimensional scale
the object is described by measuring its attributes along multiple
dimensions.
STAGES OF DEVELOPMENT OF SCALES
The development of measurement scales involves the following
stages:
a) Concept Development: Concept development has
emerged from techniques developed by business
research. The first step in concept development is that the
researcher must very clearly understand the major issues
pertaining to the business research problem at hand. The
stage of concept development is more prevalent in basic
research than in applied research. This is because in
basic research one is seeking new discoveries where the
conceptual framework is still fuzzy.
b) Specification of Concept Dimensions: In the second
stage, the researcher needs to outline the various
dimensions of the concept that was developed earlier. The
dimension specification can be achieved by any of the
following three techniques:
i) Deduction
ii) Intuitive Approach
iii) Empirical Correlation
For example, if a new product is being launched, the
dimensions can be product quality, brand value,
customer acceptance, price leadership, concern for the
environment etc.
c) Selection of Indicators: An indicator is a
specific way in which a particular respondent's
knowledge, advice, experience or opinion can be measured.
Indicators may include questions, scales and other such
methods which give a measurable output. If more than
one indicator is used, it provides better validity to the
data collected.
d) Formation of Index: The final step is to combine several
indicators into an index. This is
particularly useful if a specific concept has multiple
dimensions which have different measurement indicators.
The researcher may therefore need to combine them
into one single index. One such example often used by
economists is the WPI (Wholesale Price Index).
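One simple way of combining indicators into a single index is a weighted average. The sketch below is only an illustration under assumptions: the dimensions, scores and weights are hypothetical, and the weighted average is one possible combining rule, not the method by which the WPI itself is compiled.

```python
def composite_index(scores, weights):
    """Weighted average of indicator scores (weights need not sum to 1)."""
    total_weight = sum(weights.values())
    weighted_sum = sum(scores[name] * weights[name] for name in scores)
    return weighted_sum / total_weight

# Hypothetical indicator scores on a 1 to 5 scale for a new product.
scores = {"product_quality": 4, "brand_value": 3, "price_leadership": 5}

# Hypothetical importance attached to each dimension.
weights = {"product_quality": 0.5, "brand_value": 0.3, "price_leadership": 0.2}

index_value = composite_index(scores, weights)
print(round(index_value, 2))
```

The choice of weights is itself a judgment the researcher must justify, since different weights can change the ranking of the objects being compared.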
SCALE CONSTRUCTION
The five main techniques by which scales can be developed are
as follows.
i) Arbitrary Technique: In this case, as the name
suggests, a scale is developed on an ad hoc basis. Such
scales are simply presumed to
measure appropriately the concept for which they
have been developed. Even though this is the most
often used method of scale construction, the scientific
evidence to support such a scale is rather scanty.
ii) Consensus Technique: In this technique, an expert
panel is asked to select and evaluate items that should
be included in the scale design. The selection is based on their
judgment of whether such items are
important to the research area and have a direct
bearing on the research outcome.
iii) Item Analysis Technique: In this method, which has a
more scientific approach, the various individual items that
can be chosen for inclusion in the scale are put
together in the form of a test questionnaire, which is then
administered to a group of respondents. Subsequent to
the test, the individual items are evaluated on
their discrimination value. Generally, the items which
show a high discrimination value are utilized for scale
design.
iv) Cumulative Technique: In this technique, the scale is
developed on the basis of conformance
to a ranking order. This ranking order can be based
on either ascending or descending discrimination
potential. The cumulative effect of items depending on the
research concept then gives rise to an appropriate
scale.
v) Factor Technique: Factor scales are designed on basis
of correlations of items which indicate a common
factor for the relationship between items.
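The discrimination value used in the item analysis technique can be sketched as below. The text does not define it, so this assumes one common convention: respondents are ranked by total score, and each item's discrimination value is the difference between its mean score in the highest-scoring group and in the lowest-scoring group. The responses, the group fraction and the cut-off of 1.0 are all hypothetical.

```python
def discrimination_values(responses, fraction=0.25):
    """Per-item difference in mean score between top and bottom groups."""
    ranked = sorted(responses, key=sum, reverse=True)   # highest totals first
    k = max(1, int(len(ranked) * fraction))             # size of each group
    high, low = ranked[:k], ranked[-k:]
    n_items = len(responses[0])
    return [
        sum(r[j] for r in high) / k - sum(r[j] for r in low) / k
        for j in range(n_items)
    ]

# Hypothetical test questionnaire: 4 respondents x 3 items, scored 1 to 5.
responses = [
    [5, 4, 3],
    [4, 4, 3],
    [2, 3, 3],
    [1, 2, 3],
]

values = discrimination_values(responses, fraction=0.5)
selected_items = [j for j, v in enumerate(values) if v >= 1.0]
print(values, selected_items)
```

In this example the third item receives the same score from everyone, so it cannot separate high scorers from low scorers and would be dropped.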
PROBLEMS IN CONSTRUCTION OF SCALES
The scales of measurement are developed for business research
along the lines of the techniques described in the earlier
subsections. Their construction, however, faces certain problems,
the important ones being as
follows:
1. Determination of Sequence: It is not possible to
measure every phenomenon, particularly in the field of
business. Only those phenomena can be measured which
have a sequence. If there is no sequence in particular
phenomena, they cannot be measured since they remain
scattered and unconnected. In physical events the
sequence or continuity may be easily noticed. The
formation of scales of measurement concerning physical
properties is therefore rather easy. For example, weight,
temperature, length, breadth and other physical
properties vary from a minimum to a maximum limit
with a sequence in between. In industry, at times it is
difficult to find such a sequence. For example, the
amount of preference for a particular brand cannot be
measured directly. However, the business researcher should try
to determine the minimum and maximum limits of
brand preference so that some scale of measurement
may be developed.
2. Determination of Validity: Validity means the extent to
which an instrument measures what it proposes to
measure. For example, if an investigator wants to
measure the purchasing power of a person, only that
instrument will be valid which measures it correctly.
In the words of Selltiz, Jahoda and others, "A measuring
procedure is valid to the extent to which scores reflect
true differences among individuals, groups or situations in
the characteristics it seeks to measure."
There is no direct way of determining the validity of measurement
because it is difficult to locate the exact position of an individual
on the scale of the characteristic that is being measured. In the
absence of this information, the validity of an instrument is
judged by the compatibility of its results with other relevant
evidence. Relevant evidence means evidence about the truth of
the conclusions of measurement. It depends on the nature and
purpose of the measuring instrument. When the purpose of a scale is
prediction, its validity is judged in terms of the accuracy of
predictions made on the basis of its measurements. This is known
as pragmatic validity.
On the other hand, the validity concerned with judging whether the
instrument measures the concept it is intended to measure is
known as construct validity.
i) Pragmatic Validity: In pragmatic validity, the usefulness
of the scale in prediction is judged. The investigator is
not interested in the performance of the individual on the
scale as such but in what it indicates about a certain
individual trait. There
should be a reasonably valid and reliable criterion with
which the scores on the scale may be compared.
Normally, the nature of the technique will determine the
relevance of the criterion. The criterion with which the
scores on the scale are compared should itself be valid
and reliable. The reliability and validity of the criterion
may be improved by carefully defining its various
dimensions.
ii) Construct Validity: Constructs are such measures as
intelligence, scales of attitudes, scales of
modernization, measurement of group morale etc.
The definition of such constructs consists in part of sets
of propositions about their relationship to other
variables. Thus, in order to establish construct validity it
should be determined whether the measurements
obtained by using a scale are consistent with these
predictions. The predictions in connection with
construct validity are of a different nature and serve
different functions from those involved in determining
pragmatic validity. The examination of construct
validity involves validation of a scale and also of the
theory underlying it. According to Campbell and Fiske,
the investigation of construct validity can be made
more rigorous by increased attention to the adequacy
of the measure of the construct in question, before its
relationships to other variables are considered. Two
kinds of evidence about a measure are needed
before one is really justified in examining its relationship
with other variables:
a. Evidence that different measures of the construct
yield similar results; and
b. Evidence that the construct thus measured can be
differentiated from other constructs.
Thus construct validity cannot be tested by a single procedure.
It requires evidence from a number of sources such as
correlation with other scales, internal consistency of items,
stability of pattern over time, etc. Estimates of pragmatic validity
may be useful in the evaluation of construct validity. If a scale
of measurement shows pragmatic validity, its construct validity
may be explored. Both these types of validity are required to be
determined for sufficient validation of a scale of measurement.
W.J. Goode and P.K. Hatt have pointed out the following four bases
for the investigation of the validity of a scale.
i) Logical Basis: If the conclusions reached through a scale
are illogical, its validity may be reasonably doubted.
Valid scales always provide logical conclusions. For
example, if on the basis of a scale the children of
organized families are declared to be more prone to crime than
those of broken families, this conclusion being illogical,
the scale will be declared invalid. It is commonly
believed that the children of broken families are
more prone to crime than those of organized families. The logical
basis, however, is variable and sometimes even
subjective. Therefore, this method is only tentative
and not definite and scientific.
ii) Public Opinion: The opinion of knowledgeable persons is
also a criterion of the validity of the conclusions reached by
a scale. If the conclusions are contradicted by public
opinion, the scale may be declared invalid. This
criterion, however, is far from scientific, since in
many cases conclusions that went against public opinion, and
even against the opinion of knowledgeable persons, have
proved to be correct in the long run. It
may be remembered that public opinion here
means the opinion of knowledgeable persons.
iii) By Application upon a Known Group: This criterion of
validity is perhaps the best. In this method, the scale in
question is utilized to measure a known individual or
group. If the scale yields the already known conclusions,
it is declared valid. For example, if we know that a
particular individual A is an advocate of love marriages
while another individual B is against them, a scale
which shows their attitudes accordingly is a valid
scale. On the other hand, a scale whose conclusions
contradict already known facts may be declared
invalid.
iv) Independent Testing: Another method of determining
validity is known as independent testing. In this
method, instead of testing a total phenomenon through
a scale, its different aspects are measured separately.
If all these conclusions are of the same type and
coherent, the scale may be declared valid. For
example, religiosity leads to several known attitudes
which may be measured separately by a particular
scale. If the results are consistent and coherent the scale
is valid. This method also suffers from severe
limitations. In fact, the different aspects of a
phenomenon do not have equal value, nor are they
always coherent or consistent with one another.
Therefore, the criterion of independent testing is not
very reliable.
ATTITUDE SCALES
Attitude scales evolve out of a series of short but carefully
formulated statements or propositions dealing with selected
aspects of the issues, institutions, groups or people under
investigation.
The individual reacts verbally, with approval or
disapproval, to the items on the scale. These reactions quantify
his position on the issues in question. Various attitude
measuring scales are discussed in the following section.
ATTITUDE MEASURING SCALES
Given below are various scales for measuring attitudes:
A. Opinion Scales: Opinion scales are based on opinions
as the basis of attitudes. A person's attitudes towards
some specific person, subject matter or object can be
known by analysis of his opinions concerning them. On
this premise, psychologists have evolved scales for
defining the opinions of individuals relating to various
objects, problems and persons. These scales differ in
respect of the method of their construction and kind. But
the aim of every scale is to locate and determine the
position of an individual upon a measure extending from
one extreme of acceptance to the other extreme of
refusal. The opinion scales reveal the reactions of the
individual towards some particular things, and from these
reactions his attitude can be deduced. It is advised that,
the scale should be so formulated as to reveal to the
maximum possible level the attitudes of the individual.
This requires a selection of statements which can give
expression to the concurrence or otherwise of the attitude
of the person relating to a person, matter or object.
Opinion Scales are many but the important among them
are following:
1. Thurstone Scales
2. Likert Scales
3. Guttman Scales
In addition to these, other relevant opinion scales are: the
Bogardus Scales, Rank Order Scale and Paired Comparison
Scale.
1. Thurstone Scale: Between 1919 and 1931, these scales
were constructed by Thurstone and his colleagues for the
measurement of opinions and beliefs of groups
concerning varied questions such as war, the church,
Negroes, capital punishment, birth control, etc., using the
consensus scale construction approach.
Construction of the Scale: The Thurstone scale is constructed
through the following steps:
1. Collection of numerous sample opinions related to the
question or subject presented.
2. Determination of the scale value of these
opinions by some definite criterion. This evaluation
leads to the elimination of controversial statements,
leaving only those statements concerning which all
specialists are unanimous. Opinions extend between a
negative and a positive limit.
3. Determination of the median point for every statement
according to the opinion of the specialist judges.
4. Finally, it is necessary to see whether the statements
contained in the scale are in a definite order or not.
The order of statements should be such that they
proceed from maximum to minimum acceptance. For
this, statements with which all specialists agreed are
placed at the beginning of the scale, while statements
with which all specialists disagreed are placed at the
end.
Precautions in the Construction of the Scale: The
following precautions are observed:
1. It should be ensured that every statement of the scale
is distinctly worded.
2. It must be seen that the number of judging
specialists is large enough to indicate clearly the
position which would be given to the opinion
expressed by the subject.
3. Only those statements about which the judges are
unanimous should be given a place in the scale. If a
particular statement is placed in the same class by
each of the specialists, it should definitely be chosen.
But if the statement is placed in a different class by
each of the specialists, it is rejected as
controversial. In other words, only opinions on which
there is unanimity should be included in the scale.
Application: The Thurstone method has wide application
in measuring attitudes. It is, therefore, used for
developing differential scale which can measure attitudes
towards intangible things such as war, religion etc. Scales
developed using this technique are considered as the
most appropriate and reliable method if used to measure
a single attitude.
Limitations
i) It is a costly method and requires large effort to
develop them.
ii) The method is not completely objective; it ultimately
involves a subjective decision process.
iii) It provides less information about the respondent's
attitude in comparison to summated (Likert) scales.
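The construction steps described for the Thurstone scale can be sketched as follows. This is a simplified illustration under stated assumptions: the judges' ratings on an 11-point continuum are hypothetical, disagreement among judges is measured by the simple range of their ratings (with an arbitrary cut-off of 2) rather than by any formula given in the text, and a respondent's score is taken as the median scale value of the statements he endorses.

```python
from statistics import median

# Hypothetical ratings by five judges on an 11-point continuum
# (1 = most unfavourable, 11 = most favourable).
judge_ratings = {
    "s1": [2, 2, 3, 2, 3],
    "s2": [1, 6, 11, 3, 9],     # judges disagree widely: controversial
    "s3": [9, 10, 9, 10, 9],
}

MAX_RANGE = 2  # assumed cut-off for "the judges are (nearly) unanimous"

# Keep only statements on which judges agree; scale value = median rating.
scale_values = {
    statement: median(ratings)
    for statement, ratings in judge_ratings.items()
    if max(ratings) - min(ratings) <= MAX_RANGE
}

def respondent_score(endorsed):
    """Median scale value of the retained statements a respondent endorses."""
    return median(scale_values[s] for s in endorsed if s in scale_values)

score = respondent_score(["s1", "s3"])
print(scale_values, score)
```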
2. Likert Scale: Rensis Likert constructed a
scale in 1932 which differed from Thurstone's scale. This
scale aimed at locating the attitudes of various human
groups relating to imperialism, internationalism and
Negroes. This scale is also called a summated scale.
Construction of the Likert Scale: The following steps
are involved in the Likert scale:
1. To construct many statements related to the object or
problem the attitudes towards which are to be studied.
2. To show these statements to the subjects and to get
each statement classified into one of the following groups:
Strongly Approve, Approve, Undecided, Disapprove,
Strongly Disapprove.
3. To award points to the above classification in the
following manner:
Strongly Approve = 5, Approve = 4, Undecided = 3,
Disapprove = 2, Strongly Disapprove = 1.
4. To find the correlation between the total score of the
subject and the scores of the statements individually; and
finally,
5. To exclude those statements which bear a negligible
correlation to the total score.
Advantages: The Likert scale has the following advantages:
i) It is easy and simple to construct.
ii) It is more reliable and provides more information.
iii) It can easily be used in respondent centred and stimulus
centred studies.
iv) It takes less time to construct.
Limitations: The limitations of the Likert scale are as follows:
i) It does not give an intensity comparison of responses.
ii) It is an ordinal scale and not an interval scale.
iii) It is often noticed that the total score of a respondent has
little clear meaning, since a given score can be obtained
from a variety of answer patterns.
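The Likert procedure of summing points and discarding weakly correlated statements can be sketched as below. The responses are hypothetical, the correlation used is the ordinary product moment coefficient between each statement and the total score, and the cut-off of 0.3 for a "negligible" correlation is an assumption made only for illustration.

```python
from math import sqrt

POINTS = {"Strongly Approve": 5, "Approve": 4, "Undecided": 3,
          "Disapprove": 2, "Strongly Disapprove": 1}

def pearson(x, y):
    """Product moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# One hypothetical answer sheet converted from labels to points.
first_sheet = [POINTS[a] for a in ["Strongly Approve", "Approve", "Undecided"]]

# Hypothetical points for 5 respondents x 3 statements.
scores = [first_sheet, [4, 4, 2], [3, 3, 4], [2, 2, 3], [1, 1, 3]]

totals = [sum(row) for row in scores]                  # summated scores
item_total = [pearson([row[j] for row in scores], totals)
              for j in range(len(scores[0]))]
retained = [j for j, r in enumerate(item_total) if r >= 0.3]  # assumed cut-off

print(totals, [round(r, 2) for r in item_total], retained)
```

The second limitation listed above is visible here: respondents two and three both total 10 points although their answer patterns differ.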
3. Guttman Scale or Cumulative Scale: Guttman constructed
a scale in 1941 to measure and study the level of morale of American
soldiers. This scale comprises the following main
elements:
1. To determine whether any statement can be shown
upon the scale or not.
2. To prepare a scalogram to test the consistency of any
statement.
3. To vary the questions concerning the same problem in
such a manner as to determine whether the opinion of the
subject is consistent.
Advantages: The advantages of the Guttman scale are as follows:
i) It measures only a single dimension of the attitude
being measured.
ii) The researcher's subjective judgment is not allowed to
creep into the development of the scale, since the scale is
determined by the replies of respondents.
iii) Only a small number of items is required to make the
scale.
Limitations: The Guttman scale has the following limitations:
1. The procedure is tedious and complex.
2. It is not very reliable.
3. It is not a very objective scale.
4. It is difficult to compare with other scaling methods.
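The consistency check performed with a scalogram can be sketched with a coefficient of reproducibility. The text does not give a formula, so this assumes a commonly used form: items are ordered from most to least endorsed, each respondent's answers are compared with the ideal cumulative pattern for his total score, and the coefficient is 1 minus the share of deviating answers. The responses are hypothetical (1 = agrees, 0 = disagrees).

```python
def reproducibility(responses):
    """Coefficient of reproducibility: 1 - errors / total answers."""
    n_items = len(responses[0])
    # Order items from most to least endorsed.
    popularity = [sum(r[j] for r in responses) for j in range(n_items)]
    order = sorted(range(n_items), key=lambda j: -popularity[j])
    errors = 0
    for r in responses:
        ordered = [r[j] for j in order]
        score = sum(ordered)
        # A perfectly cumulative respondent endorses the 'score' easiest items.
        ideal = [1] * score + [0] * (n_items - score)
        errors += sum(a != b for a, b in zip(ordered, ideal))
    return 1 - errors / (len(responses) * n_items)

# Hypothetical answers of six respondents to three statements.
responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 1, 0],   # breaks the cumulative pattern
    [0, 0, 0],
    [1, 0, 0],
]

coefficient = reproducibility(responses)
print(round(coefficient, 3))
```

A coefficient close to 1 suggests that the statements form a single cumulative dimension; the threshold for accepting a scale is a matter of convention and is not fixed here.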
B. Factor Scales: Factor scales are developed through factor
analysis or on the basis of intercorrelations of items which
indicate that a common factor accounts for the relationship
between items. According to C.W. Emory, "Factor scales are
particularly useful in uncovering latent attitude dimensions and
approach scaling through the concept of multiple dimension
attribute space."
C. Multidimensional Scale: Multidimensional scaling can be
characterized as a set of procedures for portraying perceptual or
affective dimensions of substantive interest. It provides useful
methodology for portraying subjective judgments of diverse
kinds. It is used when all the variables in a study are to be
analyzed simultaneously and all such variables happen to be
independent. Through this technique, one can represent
geometrically the locations and interrelationships among a set of
points.
Advantages: The significance of multidimensional scaling lies in
the fact that it enables the researcher to study the perceptual
structure of a set of stimuli and the cognitive processes
underlying the development of this structure. Psychologists, for
example, employ multidimensional scaling techniques in an
effort to scale psychological stimuli and to determine
appropriate labels for the dimensions along which these stimuli
vary.
Limitations: It is complicated in computation.
D. Rating Scales: Besides the opinion scales, another method
is that of rating scales. The striking feature of rating scales
lies in the fact that here the attitudes are evaluated not on the
basis of the opinions of the subjects, but on the basis of the
opinions and judgment of the experimenter himself. Thus in
rating scales the discretion of the experimenter enters into the
measurement of attitudes.
The experimenter collects the data for rating scales through the
following means:
1. Nonverbal Behavior
2. Verbal Behavior
3. Secondary Expressive Cues
4. Clinical Type Interview
5. Personal Documents
6. Projective Techniques
7. Immediate Experience
The sources noted above provide the experimenter with enough data
concerning the attitude. Compared to nonverbal behavior,
conversation is an easier means of understanding attitude.
In addition to conversation, various facial expressions and
fluctuations in the volume of the voice are also good indicators of
attitude. According to Krech and Crutchfield, it is a
commonplace belief that what a man says may be less revealing than
how he says it. Thus, the manner of expressing a response
may be more informative than the verbal expression itself.
On the other hand, in a clinical interview the subject can be
induced to respond to questions which he would normally
hesitate to answer, because he believes that the answers
are necessary for his treatment. Diaries, essays, letters, poetry,
stories and other kinds of individual writing manifest the attitude
of the individual. Among them, the diary is of the greatest
importance since it is the most personal. An autobiography also
reveals the attitudes of the author. Projective techniques have also
proved very useful in revealing attitudes.
Types of Rating Scales: In general, rating scales are put
under two classes: (i) Relative, and (ii) Absolute. Both of these
scales are briefly discussed below:
i) Relative Scale: In this scale the experimenter gives the
individual's attitude a place on a scale extending from
the highest to the lowest quantity. In studying the individual
in the circumstances of the human group, the position of the
individual relative to the positions of others is considered.
Thus, in this method the individual is allotted a relative
position on the scale, which is his place in comparison
with other individuals.
ii) Absolute Scale: This scale provides an absolute position to an
individual in the population. Here a particular part of the
population is examined and then the opinions of an individual
are analyzed. The investigator then places the opinions of the
individual on the scale, showing the percentage of the
population concurring with his opinions as well as the
percentage differing from him. This gives the absolute
position of his opinions.
SOURCES OF ERRORS IN MEASUREMENT OF
ATTITUDES USING SCALING TECHNIQUES
The following are the possible sources of error in measurement.
a) Respondent: Many a time the respondent may not be
willing to express strong negative feelings. At other times he
may have insufficient knowledge but may not admit his
ignorance. These may lead to guesstimates of scale factors.
Extraneous factors like fatigue and boredom may also affect
the correctness of the respondent's answers.
b) Situational: Situational factors may also affect measurement. Any
condition, place or timing that places a strain on the interview
may have a serious effect on the correctness of results.
Therefore, all situational factors should be properly considered
before a scaling instrument is used to elicit responses.
c) Computational: Errors may also creep in because of
incorrect coding of scales, faulty tabulation and/or faulty statistical
calculations, particularly when the data are analyzed.
d) Instrument: Errors may also arise because of a defective
measuring instrument. The use of esoteric words, poor printing,
response choice omissions etc. may also
lead to error in measurement.
SUMMARY
In this chapter, the underlying concepts behind the study of
attitude measurement, and the importance of and difficulties in
attitude measurement, have been elaborated upon. The chapter also
discusses the concept and significance of scales, and how to
develop and construct the various scales needed in research
situations. It also covers the sources of error in using scaling
techniques and how to minimize them.
REVIEW QUESTIONS
1. Define the term attitude. State its characteristics.
2. How is attitude measured? State the problems in measuring
attitude.
3. What is the significance of measurement of attitude?
4. What is scaling? Explain the significance/importance of scaling
in business research.
5. Explain the problems faced in the construction of scales. How
are they overcome?
6. Explain the basis of classification of scales.
7. Discuss the types of scale construction techniques.
8. Explain the types of attitude measuring scales.
BLOCK 3 DATA PRESENTATION AND ANALYSIS
CHAPTER 9 DATA PROCESSING
CHAPTER 10 STATISTICAL ANALYSIS AND
INTERPRETATION OF DATA: NON
PARAMETRIC TESTS
CHAPTER 11 MULTIVARIATE ANALYSIS OF DATA
CHAPTER 12 MODEL BUILDING AND DECISION MAKING
CHAPTER 9 DATA PROCESSING
Objectives
After reading this chapter, the learner would be able to:
• Understand the essential concepts and techniques for
data manipulation
• Know about concepts of data editing and precautions that
should be taken
• Understand how data is codified and classified along with
various tools and methods to do so
• Carry out the exercise of tabulation of data
Structure
• Basics of Data Manipulation
• Data Editing
• Codification of Data
• Classification of Data
• Tabulation of Data
• Summary
• Review Questions
• Further Reading
BASICS OF DATA MANIPULATION
Data Processing: Data processing may be defined as the
process of editing, coding, classification and tabulation of
collected data so that it becomes amenable to analysis.
Data Analysis: Data analysis may be defined as the process of
computing certain parameters along with identifying the
relationship patterns that may exist among data groups. In the
process of analysis, relationships may be discovered that
support or conflict with the original hypothesis. This analysis leads to
valid conclusions only if the relationship pattern stands the
statistical test of significance.
ELEMENTS OF DATA PROCESSING
The elements of data processing are as follows:
i) Editing
ii) Coding
iii) Classification
iv) Tabulation
Activity 9.1
Take a simple example of data pertaining to monthly shopping for
groceries and routine household items. How would you edit,
code, classify and tabulate such data?
DATA EDITING
Editing of data may be defined as the process of scrutinising raw
data, especially data collected through surveys/schedules, to detect
errors and omissions and to correct these wherever
possible.
Steps involved in Editing: The following steps are involved in
editing:
1. Deciphering: If the handwriting of a data collector is
difficult to read, the editor deciphers it to make the data
readable.
2. Checking entries: The editor must also scrutinise the
data in the schedule to see if any entry is missing or
incomplete. The editor must also look over schedules for
inconsistencies, e.g., entries of age and date of birth
must agree.
3. Approximation: This is the most important work the
editor has to perform. Big figures as reported by the
enumerator may be complex and difficult to understand;
hence, to make the data meaningful and useful, they must be
approximated. Approximation means rounding off
the figures with a view to simplifying them and making
them suitable for analysis without impairing the standard
of reasonable accuracy. Many times, it is not necessary to
give actual numbers, and approximate figures serve the
purpose. Thus the process of approximation enables a
clear and easy grasp of figures and facilitates calculation
and comparison. The extent to which approximation
should be done depends upon the degree of accuracy
desired in the data.
BOX 3.A
Table 3.1
(Error = approximated value - true value)
True      Method I               Method II              Method III
value     Approximated  Error    Approximated  Error    Approximated  Error
          value                  value                  value
5137      5100          -37      5200          +63      5100          -37
3529      3500          -29      3600          +71      3500          -29
2178      2100          -78      2200          +22      2200          +22
1489      1400          -89      1500          +11      1500          +11
4291      4200          -91      4300          +09      4300          +09
4302      4300          -02      4400          +98      4300          -02
20926     20600         -326     21200         +274     20900         -26
When approximation has been made, it is necessary to keep in
mind that in performing mathematical operations on the approximated
figures, the error is likely to increase or decrease according
to the nature of the operation.
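The three methods in Table 3.1 are not named in the text. Judging from the figures, they appear to correspond to rounding down, rounding up and rounding to the nearest hundred. The following Python sketch works under that assumption (the function names and the method labels "down", "up" and "nearest" are illustrative only) and shows how the individual errors combine when the approximated figures are added.

```python
def approximate(value, unit=100, method="nearest"):
    """Round a non-negative integer to a multiple of `unit`.

    The method labels are assumptions made for this sketch:
    "down" discards the remainder, "up" raises to the next multiple,
    "nearest" picks the closer multiple (halves go up).
    """
    if method == "down":
        return (value // unit) * unit
    if method == "up":
        return -(-value // unit) * unit  # ceiling division for integers
    if method == "nearest":
        return ((value + unit // 2) // unit) * unit
    raise ValueError(f"unknown method: {method}")


def total_error(values, unit=100, method="nearest"):
    """Sum of (approximated value - true value) over all values."""
    return sum(approximate(v, unit, method) - v for v in values)


true_values = [5137, 3529, 2178, 1489, 4291, 4302]  # true values from Table 3.1

for method in ("down", "up", "nearest"):
    total = sum(approximate(v, 100, method) for v in true_values)
    print(method, total, total_error(true_values, 100, method))
```

On addition, the errors of the first two methods accumulate because they all carry the same sign, whereas under the third method positive and negative errors largely cancel each other.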
STAGES IN EDITING
Editing of data can be carried out in two stages, depending
on whether it is done in the field itself or at a central data collection
office. Accordingly, these stages are known as (a) Field Editing and
(b) Central Editing.
In field editing, the reporting forms are
reviewed by the investigators/enumerators themselves on the very
evening of the interview. The review corrects any illegible
writing or improvised abbreviations coined and used by the
investigator. However, field editing does not imply that the
investigator should personally supply data that the respondent
did not provide.
Central editing takes place when all forms or schedules have
been completed and deposited at the central office by the
various investigators/enumerators. The advantage inherent in
this type of editing is that a single editor (or at most a team
of editors in the case of a large sample size) edits the entire set of
collected data, utilizing the techniques of deciphering, checking
entries and approximation. In this editing process, editors may
also cleanse the data of obviously wrong replies; this is especially
true in the case of mailed surveys.
PRECAUTIONS IN EDITING
Editors must exercise caution and restraint while editing the
collected data. Good editors follow these
procedures/directions:
a. They are well aware of the instructions provided to the
investigators who collect the data. They must also be well
versed in the directions given to them for editing the data.
b. While correcting data, they should do so in a manner
that keeps the original data legible for any future
reference, if needed.
c. Ideally, they should use a different colour of ink for any
entries made by them.
d. All data changes or missing answers supplied by them
must be initialled.
e. All edited forms or schedules must be dated and
initialled by the editor.
Activity 9.2
List 5 important skills that a good data editor must have.


CODIFICATION OF DATA
The process whereby numerals or symbols are assigned to
response data in order to organize the data into a limited
number of categories or classes is termed codification.
Obviously, the categories so created must be relevant to the
research problem being investigated. The classes/categories
created by the codification process must have the following properties:
a. Exhaustiveness implies there must be a class for every
item of data.
b. Mutual Exclusiveness implies a response can be put in
one and only one cell of a category set.
c. Unidimensionality implies each class is defined in terms
of only a single concept.
The coding process is beneficial for efficient analysis of data.
It is through codification that voluminous data is
reduced to a smaller number of classes which present the
critical data and make it amenable to analysis. In good
research designs, the codification methodology is determined
at the stage of questionnaire design itself. This enables
creation of a coding sheet to go along with the
questionnaire. The coding sheet facilitates transcription of data from
the questionnaires, and care in codification should minimize
coding errors.
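The properties of exhaustiveness and mutual exclusiveness can be checked mechanically once a coding sheet exists. The following is a minimal Python sketch, assuming a hypothetical codebook for an "occupation" question; the codes, the answer texts and the function names are invented for illustration and are not taken from any particular questionnaire.

```python
# Hypothetical codebook for an "occupation" question: code -> accepted answers.
CODEBOOK = {
    1: {"salaried", "employee"},
    2: {"self-employed", "business"},
    3: {"student"},
    4: {"retired"},
    9: {"other"},
}


def is_mutually_exclusive(codebook):
    """True if no answer text is listed under more than one code."""
    seen = set()
    for answers in codebook.values():
        if seen & answers:
            return False
        seen |= answers
    return True


def code_responses(responses, codebook):
    """Return (codes, uncoded). The scheme is exhaustive for this data
    only when `uncoded` comes back empty."""
    lookup = {answer: code for code, answers in codebook.items() for answer in answers}
    codes, uncoded = [], []
    for response in responses:
        key = response.strip().lower()
        if key in lookup:
            codes.append(lookup[key])
        else:
            uncoded.append(response)
    return codes, uncoded


codes, uncoded = code_responses(["Student", " retired", "Farmer"], CODEBOOK)
print(codes, uncoded)
```

Here the response "Farmer" is returned as uncoded, which signals that the category set is not yet exhaustive and that the coding sheet needs either a new class or a rule assigning such answers to the catch-all code.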
Activity 9.3
Suppose you are the manager of a retail store. How would you
codify the data pertaining to women's apparel? Give some
examples.







CLASSIFICATION OF DATA
Classification is the process of arranging data in groups or
classes according to resemblances and similarities. In the
classification process, data units having similar characteristics
are placed in one class, and in this manner the entire data set is
divided into a number of classes.
OBJECTIVES OF CLASSIFICATION OF DATA
1. Depiction of Homogeneity and Heterogeneity of
data: Classified data help the researcher to discern the
homogeneous or heterogeneous pattern in the collected
data.
2. Ease in Understanding: Unnecessary details are
eliminated through the process of classification, enabling
data to be easily understood.
3. Comparative Analysis: It is very difficult to compare
unorganized and dissimilar data. Classification enables
the investigator to compare the different sets of response
data.
4. Improve Usability: Classification increases the utility of
collected data. It enables a researcher to evolve possible
solutions to the research problem at hand.
5. Basis of Tabulation: Unless data is classified, tabulation
of data becomes impossible.
Activity 9.4
Suppose you are conducting a census of the major animals in
Ranthambore Animal Sanctuary. How would you attempt to
classify the data you collect?







Characteristics of a good Classification: Because
classification plays a very important role in the research process, a
good classification must possess the following characteristics:
1. Homogeneity: The data placed in a particular class
should be of a homogeneous nature.
2. Unambiguous: Classification of data should be simple
and very clear. No ambiguities should be allowed, and any that
exist should be removed in the classification process.
3. Uniform Basis: The basis of classification is one of
the major considerations. A uniform basis of classification
makes the data more scientific and reliable.
4. Goal Oriented: The objective of the research study is the
cornerstone of classification. The classified data must be
oriented towards the research goal.
5. Comprehensiveness: An ideal classification of data is
comprehensive, which means that the entire data set
gets classified.
6. Stability: Classification should be stable and should not
be changed frequently. An unstable classification fails to
deliver reliable results.
Activity 9.5
How would classification of data, say on sightings of the Siberian
Crane in Bharatpur Bird Sanctuary, help in the goal orientation of
a researcher who wishes to correlate the level of water in the lake with
the migratory pattern?



Basis of the Classification: Data are classified on the basis of
the characteristics of the different groups of data units. These
characteristics express the similarity of attributes which may be
traced to individual units. These characteristics can be of the
following two types:
i) Descriptive or qualitative
ii) Numerical or quantitative
Examples of descriptive characteristics are occupation,
literacy, marital status and gender.
Numerical characteristics include age, income, weight, etc.
Descriptive characteristics cannot be measured in quantity,
whereas numerical characteristics can be put into figures.
Data may be classified on the basis of qualitative characteristics,
i.e., attributes. Attributes cannot be measured
quantitatively, and this classification is called classification
according to attributes. When the data are classified on the
basis of quantitative characteristics, the classification is termed
classification according to class-intervals.
1. Classification according to Attributes: When the data
are classified on the basis of qualities or attributes, i.e., their
presence or absence, it is called classification on the
basis of attributes. This is a descriptive
classification of data and is of the following three types:
a) Simple Classification: In this method, the data are
divided on the basis of a single attribute or quality.
Data having the attribute are placed in
one class and those not having the attribute in the
other class. This type of classification, in which only
one attribute is studied and the data are divided into two
parts, is called simple classification or dichotomous
classification.
b) Manifold Classification: When the whole universe of
data is classified according to more than one attribute, it is
called manifold classification. For example, if the
problem of illiteracy is studied gender-wise, there are
two attributes under study, namely, literacy and
gender. A person can be either literate or illiterate;
further, a person can be either male or female. Each
of these two attributes can be divided into two classes.
The data can thus be divided into four classes:
i) Males who are literate
ii) Males who are illiterate
iii) Females who are literate
iv) Females who are illiterate
A third attribute, say religion, may be used and the data
subdivided again on the basis of religion, resulting in more
classes. Such classification, where more than one attribute
is taken into account, is also called multiple classification.
c) Arbitrary Classification: Classification may be
arbitrary where it is difficult to locate a particular
attribute. In the various groups of data, it is not
always possible to find natural or very well defined
differences. Therefore, such classification is of an
arbitrary nature. For example, if the female population
of the universe is to be divided into two classes, i.e.,
the beautiful women and ugly women, the decision
about beauty and ugliness is bound to be arbitrary.
It is a matter of opinion only.
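The manifold classification by literacy and gender described above can be sketched in a few lines of Python. The records below are hypothetical, and the function name `manifold_classes` is an illustrative choice; the point is only that two attributes with two classes each yield 2 x 2 = 4 classes, and that every unit falls in exactly one of them.

```python
from collections import Counter
from itertools import product

# Hypothetical records, one (gender, literacy) pair per person.
records = [
    ("male", "literate"), ("female", "illiterate"), ("female", "literate"),
    ("male", "illiterate"), ("male", "literate"), ("female", "literate"),
]


def manifold_classes(records, *attribute_values):
    """Count records in every combination of attribute values,
    including combinations that happen to be empty."""
    counts = Counter(records)
    return {combo: counts.get(combo, 0) for combo in product(*attribute_values)}


table = manifold_classes(records, ["male", "female"], ["literate", "illiterate"])
for combo, n in table.items():
    print(combo, n)
```

Adding a third attribute simply means passing a third list of values, which multiplies the number of classes accordingly.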
2. Classification according to Class-Intervals: The
difference between the upper limit and the lower limit of a class
is known as the class-interval. This type of classification is
applicable only in those cases where direct
quantitative measurement of data is possible, for
example, production units, bank deposits, etc.
Given in Box 3B is an example of classification
according to class-intervals. Here there are fifty students in a
class and the marks obtained by them have been classified
into intervals of 10.
BOX 3B
Marks                Number of Students
(Class-Intervals)
0-10                 2
10-20                10
20-30                15
30-40                13
40-50                10
Total                50
The following terms are used in the classification of data
according to class-intervals:
a) Class-Limit: The class-limits are the highest and
lowest values that can be included in the class. The highest
value is known as the upper limit and the lowest value is
called the lower limit. For example, if the interval is 40-50, the
lower limit is 40 and the upper limit is 50.
b) Magnitude of Class-Intervals: The difference between
the upper limit and the lower limit of the class-interval is
known as the magnitude of the class-interval. For
instance, for the interval 20-30 in the above example, the
magnitude is 10 (i.e., 30 - 20).
c) Mid-Value: The average of the upper limit and lower
limit is called the mid-value. The mid-value is the value in the
middle of the class-limits. It is calculated in the
following manner:
Mid-Value = (Upper Limit + Lower Limit) / 2
If the class-interval is taken as 80-90,
then Mid-Value = (80 + 90) / 2 = 170 / 2 = 85
(Note that 85 is the midpoint between 80 and 90)
d) Class-Frequency: The number of items falling within a
class-interval is called the class-frequency. In the table
given in Box 3B, the class-frequency of the class-interval
30-40 is 13, that of 40-50 is 10, etc.
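The four terms defined above can be read off directly from a frequency table. As a small illustration, the Python sketch below stores the class-intervals and class-frequencies of Box 3B and computes the magnitude and mid-value of each class; the variable and function names are illustrative.

```python
# Class-intervals (lower limit, upper limit) and class-frequencies from Box 3B.
box_3b = {(0, 10): 2, (10, 20): 10, (20, 30): 15, (30, 40): 13, (40, 50): 10}


def magnitude(interval):
    """Upper limit minus lower limit."""
    lower, upper = interval
    return upper - lower


def mid_value(interval):
    """Average of the upper and lower limits."""
    lower, upper = interval
    return (upper + lower) / 2


for interval, frequency in box_3b.items():
    print(interval, magnitude(interval), mid_value(interval), frequency)
```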
Methods of Classifying Data according to Class-Intervals:
The classification of data according to class-intervals
generally utilizes the following methods:
1. Exclusive Method
2. Inclusive Method
These methods are described below:
1. Exclusive Method: In this method, as the name
suggests, items whose values equal the upper limit
of a class-interval are excluded from that interval and are
grouped into the next higher class-interval.
In this method of classification, the class-intervals are arranged
in such a manner that the upper limit of one class is the
lower limit of the class next to it. This is shown in the
table in Box 3C.
BOX 3C
Class Intervals in Exclusive Method
Class-Intervals      Number of Students
0-10                 5
10-20                10
20-30                20
30-40                5
2. Inclusive Method: In inclusive classification, the upper
limit is also included in the class-interval. This can be
demonstrated with the help of Box 3D.
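BOX 3D
Class Intervals in Inclusive Method
Marks                Frequency              Tally Marks
(Class-Intervals)    (Number of Students)   (each group = 5)
10-19                10                     ||||/ ||||/
20-29                15                     ||||/ ||||/ ||||/
30-39                20                     ||||/ ||||/ ||||/ ||||/
40-49                30                     ||||/ ||||/ ||||/ ||||/ ||||/ ||||/
50-59                20                     ||||/ ||||/ ||||/ ||||/
60-69                10                     ||||/ ||||/
70-79                5                      ||||/
80-89                0
Total                110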
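The difference between the two methods lies only in how a value equal to an upper limit is treated. The following minimal Python sketch makes the rule explicit; the function name `find_class` and its `method` argument are illustrative choices, and the class lists are taken from Box 3C and the first rows of Box 3D.

```python
def find_class(value, intervals, method="exclusive"):
    """Return the (lower, upper) class that `value` falls in, or None.

    exclusive: lower <= value < upper  (a value equal to the upper limit
               moves to the next higher class)
    inclusive: lower <= value <= upper
    """
    for lower, upper in intervals:
        if method == "exclusive" and lower <= value < upper:
            return (lower, upper)
        if method == "inclusive" and lower <= value <= upper:
            return (lower, upper)
    return None


exclusive_classes = [(0, 10), (10, 20), (20, 30), (30, 40)]   # as in Box 3C
inclusive_classes = [(10, 19), (20, 29), (30, 39), (40, 49)]  # first rows of Box 3D

print(find_class(20, exclusive_classes, "exclusive"))  # (20, 30)
print(find_class(19, inclusive_classes, "inclusive"))  # (10, 19)
print(find_class(40, exclusive_classes, "exclusive"))  # None
```

The last call shows a practical consequence of the exclusive method: a value equal to the upper limit of the highest class has no class to move into, so a further class (here 40-50) would have to be provided for it.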
The researcher needs to determine four parameters in order to
ensure purposeful classification of data:
1. Number of Classes
2. Magnitude of Class-Intervals
3. Determination of Class-Limits
4. Arrangement of Frequencies
1. Number of Classes: The number of classes depends upon the
following factors:
a. Nature of Data
b. Total Frequency
c. Range of Items, etc.
There are no fixed rules about the number of classes. As a
rule of thumb, neither too many nor too few classes
should be used.
2. Magnitude of Class-Intervals: The magnitude of the
class-interval can be obtained with the help of the following
formula:
Magnitude of Class-Interval = Range / Number of Classes
Some researchers like to use Sturges' Formula, as presented
below, to determine the class-interval:
C1 = R / (1 + 3.3 log N), where
C1 = Class-interval
R = Range = Highest Value - Lowest Value
N = Number of items to be classified
(log denotes the logarithm to base 10)
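As a worked illustration of the formula quoted above, the Python sketch below evaluates it for a hypothetical data set of 50 marks ranging from 0 to 50. The function names are illustrative, and the logarithm is taken to base 10.

```python
import math


def sturges_class_count(n_items):
    """Number of classes suggested by the rule quoted above: 1 + 3.3 log10(N)."""
    return 1 + 3.3 * math.log10(n_items)


def sturges_class_interval(highest, lowest, n_items):
    """Class-interval = Range / (1 + 3.3 log10 N)."""
    return (highest - lowest) / sturges_class_count(n_items)


# Hypothetical data set: 50 marks ranging from 0 to 50.
width = sturges_class_interval(50, 0, 50)
print(round(sturges_class_count(50), 2), round(width, 2))  # 6.61 7.57
```

In practice the result is rounded to a convenient figure, so a researcher would probably settle on about 7 classes or, as in Box 3B, a round interval of 10.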
3. Determination of Class-Limits: Class-limits are selected
in such a way that the mid-values of the classes coincide with or
come very close to the points of concentration in the
data.
4. Arrangement of Frequencies: A tally sheet is prepared
for counting the items in each class. The class groups
are written on a sheet of paper and, for each
item, a tally bar is marked against the group to which
the item belongs. Tally marks are as depicted in the
sample table shown in Box 3D.
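The tallying procedure described above can be sketched in Python as follows. The marks are hypothetical, the classes follow the inclusive method, and the string "||||/" is merely an ASCII stand-in for the usual group of four bars struck through by a fifth.

```python
def tally_marks(count):
    """Render a count as tally bars, five to a group ("||||/" stands in
    for four bars struck through by a fifth)."""
    groups, remainder = divmod(count, 5)
    parts = ["||||/"] * groups
    if remainder:
        parts.append("|" * remainder)
    return " ".join(parts)


def tally_sheet(values, intervals):
    """Count values per inclusive class (lower <= value <= upper)."""
    counts = {interval: 0 for interval in intervals}
    for value in values:
        for lower, upper in intervals:
            if lower <= value <= upper:
                counts[(lower, upper)] += 1
                break
    return counts


marks = [12, 25, 27, 31, 18, 22, 35, 29, 21, 24, 38, 15]  # hypothetical marks
sheet = tally_sheet(marks, [(10, 19), (20, 29), (30, 39)])
for (lower, upper), count in sheet.items():
    print(f"{lower}-{upper}  {tally_marks(count):<12} {count}")
```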
TABULATION OF DATA
TABULATION IS THE FINAL STAGE IN DATA PROCESSING
Tabulation is the final stage in the collection and compilation of
data, and it is the stepping-stone to the analysis and
interpretation of data. In deciding on the type of tabulation,
one has to keep in mind the nature, scope and object of the
research problem. Tabulation of data should be done in such a
form that it suits the nature and object of the research
investigation.
Tabulation simply means presenting data through tables. To
be more precise, tabulation is an orderly arrangement of data
in columns and rows.
OBJECTIVES, IMPORTANCE AND ADVANTAGES OF TABULATION
Proper tabulation of data is of great importance because if the
tabulation of data is not satisfactory, its analysis is bound to be
defective.
Tabulation of data is done in order to achieve simplicity and
convenience in the processing and interpretation of data collected
for the research problem.
Following are the major advantages of tabulation for
research methodology:
1. Ease in Understanding: Tabulated data are easier to
understand than unorganized data.
2. Time Savings: Tabulation of data leads to an immense
saving of time during analysis.
3. Ease in Drawing Diagrams: Diagrammatic
representation of data is more convenient if it is done on
the basis of tabulated data.
4. Ease in Comparison: With tabulated data, it
becomes easy to undertake a comparative study because the data
are systematically displayed.
5. Detection of Errors: Errors and omissions in data are
easily detected in tabulated data.
6. Space-Saving: In tabulation, the data are displayed in
columns and rows and so use less space.
7. No Chances of Repetition: Repetition of data may occur
if they are displayed in an unorganized fashion. Tabulation
saves the data from being repeated.
DIFFERENCES BETWEEN TABULATION AND CLASSIFICATION
Tabulation and classification of data are two processes
which are essential for any research investigation.
However, there are the following differences between the two:
1. Classification precedes tabulation. Data are first classified
and then tabulated. In this way, classification
provides the basis for the process of tabulation.
2. Classification of data is done on the basis of similarities
and dissimilarities between the items, whereas in
tabulation the data are arranged in columns and rows.
3. Tabulation is a method of data presentation, whereas
classification is a method of data preparation.
CONSTITUENTS OF A TABLE
A table is a presentation of classified data
collected from primary as well as secondary sources.
Following are the essential parts of a table:
1. Title: The title of a table is very important in showing the
relevance of the figures in a few words. Titles should be
written in clear and precise words. No table should be
prepared without an appropriate title.
2. Captions and Sub-Captions: Every column must have
its own caption and every sub-column should have a
sub-caption.
Stub is the name for the titles of rows. The box in
the top left corner of the table, above the stubs, gives a
description of the stub contents, and the data in each row are
labelled by these stubs.
3. Main Body: The body of the table is the major portion, in
which all numerical information is entered.
4. Table Number: For easy identification, each table must
be given a number.
5. Ruling and Spacing: A table must be compact and
drawn neatly, and for this purpose ruling and spacing
should be adjusted properly.
6. Arrangement of Items: Items should be well-arranged
to facilitate comparison.
7. Notes: Explanatory notes must be given in order
to clarify the contents of the table. For reference, there
must be footnotes.
8. Source of Data: The source of the data
must be mentioned.
TYPES OF STATISTICAL TABLES
Statistical tables are of the following major types:
1. Simple and Complex Tables
2. One-way Tables or Single Tables
3. Two-way Tables or Double Tables
4. Three-way Tables
5. Higher Order Tables or Manifold Tables
Given below is a description of these tables with examples:
1. Simple and Complex Tables: Tabulation of data may be
of two types, i.e., Simple or Complex. Simple Tabulation
furnishes information about one or more groups of
independent hypotheses.
Complex Tabulation represents the division of data into two or
more categories, and it gives information about one or more
sets of interrelated hypotheses.
2. One-way Tables or Single Tables: One-way tables are
one of the forms of simple tabulation. One-way tables give
answers to questions about one characteristic of the data
only.
This point can be illustrated through the table in Box 3E.
BOX 3E
Example of One-way Table
Marks Obtained by Fifty Students in Hindi
Marks        Number of Students
30-40        7
40-50        18
50-60        10
60-70        8
70-80        7
Total        50
This table supplies information about the number
of students and the marks obtained by them. For example, 18
students secured marks between 40 and 50. The table shows marks
obtained by students in one subject only, i.e., Hindi.
The questions that can be answered from this table would be
independent of each other.
3. Two-way Tables or Double Tables: Two-way tables give
information about two interrelated characteristics of a
particular phenomenon. For instance, if the number of
students given in the table is divided on the basis of
gender, the table would become a two-way table, because
it would give information about two characteristics, i.e.,
(i) the marks obtained by students in Hindi and (ii) the
gender-wise distribution of students in the various
class-intervals of marks.
BOX 3F
Example of a Two-way Table
Marks obtained by 100 students in Hindi classified according to
Gender
             Number of Students
Marks        Males    Females    Total
30-40        5        6          11
40-50        23       14         37
50-60        15       12         27
60-70        5        12         17
70-80        3        5          8
Total        51       49         100
From this table two questions can be answered. For instance, 27
students secured marks between 50 and 60, and of these 15 were
males and 12 were females.
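The row and column totals of a two-way table follow mechanically from its cells. As a minimal sketch, the Python code below stores the cell counts of Box 3F in a dictionary (an illustrative layout, not a prescribed one) and recomputes the marginal totals, which is also a quick way of detecting errors in a hand-prepared table.

```python
# Cell counts from Box 3F: marks class -> (males, females).
box_3f = {
    "30-40": (5, 6),
    "40-50": (23, 14),
    "50-60": (15, 12),
    "60-70": (5, 12),
    "70-80": (3, 5),
}

row_totals = {marks: males + females for marks, (males, females) in box_3f.items()}
male_total = sum(males for males, _ in box_3f.values())
female_total = sum(females for _, females in box_3f.values())
grand_total = male_total + female_total

print(f"{'Marks':<8}{'Males':>6}{'Females':>9}{'Total':>7}")
for marks, (males, females) in box_3f.items():
    print(f"{marks:<8}{males:>6}{females:>9}{row_totals[marks]:>7}")
print(f"{'Total':<8}{male_total:>6}{female_total:>9}{grand_total:>7}")
```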
4. Three-way Tables or Treble Tables: In this table three
interrelated characteristics can be studied. A three-way
table can answer questions relating to three interrelated
problems. If the number of students living in hostels and the
number of students who were day scholars are added to the
table in Box 3F above, the resulting three-way table can
answer questions about the following:
i) Marks obtained by the students
ii) The distribution of these students by gender
iii) The distribution of the students on the basis of
residence.
5. Higher Order Tables or Multiple Tables: A multiple
table can provide information about a large number of
interrelated questions. If, in the above example,
additional information is given about the marital status of the
students, it would become a four-way table, and similarly
tables can be of still higher order: five-way, six-way, and
so on. Such tables are called multiple or higher order
tables.
BOX 3G
Example of a Three-way Table
Marks obtained by 100 students in Hindi (Gender-wise and on the Basis of Residence)
                                    Number of Students
         Males                     Females                   Total
Marks    Hostelers Day      Total  Hostelers Day      Total  Hostelers Day      Total
                   Scholars                  Scholars                  Scholars
30-40    3         5        8      2         4        6      5         9        14
40-50    10        10       20     7         9        16     17        19       36
50-60    5         5        10     2         8        10     7         13       20
60-70    2         3        5      4         6        10     6         9        15
70-80    5         3        8      2         5        7      7         8        15
Total    25        26       51     17        32       49     42        58       100
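A three-way table contains the lower-order tables within it: summing over one characteristic yields a two-way table of the other two. The Python sketch below illustrates this with the cell counts of Box 3G; the nested-dictionary layout and the function names are illustrative assumptions.

```python
# Cell counts from Box 3G: marks class -> {gender: (hostelers, day scholars)}.
box_3g = {
    "30-40": {"male": (3, 5), "female": (2, 4)},
    "40-50": {"male": (10, 10), "female": (7, 9)},
    "50-60": {"male": (5, 5), "female": (2, 8)},
    "60-70": {"male": (2, 3), "female": (4, 6)},
    "70-80": {"male": (5, 3), "female": (2, 5)},
}


def collapse_residence(table):
    """Two-way table (marks x gender), summing over residence."""
    return {marks: {g: sum(cells) for g, cells in by_gender.items()}
            for marks, by_gender in table.items()}


def collapse_gender(table):
    """Two-way table (marks x residence), summing over gender."""
    return {marks: tuple(sum(pair) for pair in zip(*by_gender.values()))
            for marks, by_gender in table.items()}


by_gender = collapse_residence(box_3g)
by_residence = collapse_gender(box_3g)
print(by_gender["40-50"], by_residence["40-50"])
```

For the 40-50 class this gives 20 males and 16 females, and 17 hostelers and 19 day scholars, in agreement with the "Total" columns of Box 3G.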
SOLVED PROBLEMS IN DATA PROCESSING
Ex. 1 Indicate True or False
1. In comparison to a data array, the frequency distribution
has the advantage of representing data in compressed
form.
2. A single observation is called a data point whereas a
collection of data is known as tabular.
3. The classes in any relative frequency distribution are both
all inclusive and mutually exclusive.
4. Before information is arranged and analyzed using
statistical methods, it is known as preprocessed data.
5. One advantage of a data array is that it does not allow us
to easily find the highest and lowest values in the data
set.
6. Discrete data can be expressed only in whole numbers.
7. As a general rule, statisticians regard a frequency
distribution as incomplete if it has fewer than 20
classes.
8. A data array is formed by arranging raw data in order of
time of observation.
9. A frequency distribution organizes data into groups of
values describing one or more characteristics of the
data.
10. A series of rectangles, each proportional in width to the
range of values within a class and proportional in
height to the number of items falling in the class, is
called a frequency polygon.
11. The class widths of a frequency distribution are of equal
size.
Answers (Key to Ex. 1):
(1) T    (2) F    (3) T    (4) F
(5) F    (6) F    (7) F    (8) F
(9) T    (10) F   (11) F
Ex. 2 Choose the most appropriate answer.
1. Which of the following represents the most accurate scheme
of classifying data?
a) Quantitative methods
b) Qualitative methods
c) A combination of quantitative and qualitative methods.
d) A scheme can be determined only with specific information
about situation.
2. Which of the following is NOT an example of compressed
data?
a) Frequency distribution
b) Data array
c) Histogram
d) Ogive
3. Why is it true that classes in frequency distributions are all
inclusive?
a) No data point falls into more than one class.
b) There are always more classes than data points.
c) All data fit into one class or another.
d) All of these.
e) (a) and (c) but not (b).
4. When constructing a frequency distribution, the first step is to
a) Divide the data into at least five classes.
b) Sort the data points into classes and count the number of
points in each class.
c) Decide on the type and number of classes for dividing the
data.
d) None of these.
5. As the numbers of observations and classes increase, the
shape of a frequency polygon
a) Tends to become increasingly smooth.
b) Tends to become jagged.
c) Stays the same.
d) Varies only if data become more reliable.
6. Which of the following statements is true of cumulative
frequency ogives for a particular set of data?
a) Both "more-than" and "less-than" curves have the same
slope.
b) "More-than" curves slope up and to the right.
c) "Less-than" curves slope down and to the right.
d) "Less-than" curves slope up and to the right.
7. From an ogive constructed for a particular set of data.
a) The original data can always be constructed exactly.
b) The original data can always be reconstructed exactly.
c) The original data can never be approximated or
reconstructed, but valid
conclusions regarding the data can be drawn.
d) None of these.
e) (a) and (b) but not (c).
8. In constructing a frequency distribution for a sample, the
number of classes depends upon
a) Number of data points.
b) Range of the data collected.
c) Size of the population.
d) All of these.
e) (a) and (b) but not (c).
9. Which of the following statements is true?
a) The size of a sample can never be as large as the size of
the population from which it is taken.
b) Classes describe only one characteristic of the data being
organized.
c) As a rule, statisticians generally use between 6
and 15 classes.
d) All of these.
e) (b) and (c) but not (a).
10. As a general rule, statisticians tend to use which of the
following number of classes when arranging data?
a) Fewer than five.
b) Between one and five.
c) More than 30.
d) Between 20 and 25.
e) None of these.
11. Which of these is NOT a test for usability of data?
a) Source.
b) Contradiction of other evidence.
c) Missing evidence.
d) Number of observations.
e) None of these.
12. A relative frequency distribution presents frequencies in
terms of
a) Fractions.
b) Whole numbers.
c) Percentage
d) All of the above.
e) Both (a) and (c).
13. Graphs of frequency distributions are used because
a) They have a long history in practical applications.
b) They attract attention to data patterns.
c) They account for biased or incomplete data.
d) They allow for easy estimates of values.
e) Both (b) and (d).
14. Continuous data is differentiated from discrete data in that
a) Discrete data classes are represented by fractions.
b) Continuous data classes may be represented by fractions.
c) Continuous data takes on only whole numbers.
d) Discrete data can take on any real number.
Answers: 1 (d) 2 (b) 3 (c) 4(c)
5 (a) 6 (d) 7 (b) 8(e)
9 (c) 10 (e) 11 (e) 12(e)
13(e) 14 (b)
Ex. 3 Fill in the blanks:
1. Double counting is a result of .............. or .............. data.
2. It is found that 50 of 1,000 customers in a survey contain
the relevant characteristics of all customers in the survey.
The 50 customers are a .............. sample.
3. The .............. and the .............. are two methods of data
arrangement.
4. A .............. is a collection of all the elements in a group. A
collection of some, but not all, of these elements is
a ..............
5. Dividing data points into similar classes and counting the
number of observations in each class will give a ..............
distribution.
6. If data can take on only a limited number of values, the
classes of these data are called .............. Otherwise, the classes
are called ..............
7. A relative frequency distribution presents frequencies in
terms of .............. or ..............
8. A graph of a cumulative frequency distribution is called
a ..............
9. If a collection of data is called a data set, a single
observation would be called ..............
Answers (Key to Ex. 3):
1 - incomplete, biased
2 - representative
3 - data array, frequency distribution
4 - population, sample
5 - frequency
6 - discrete, continuous
7 - fractions, percentages
8 - ogive
9 - data point
Ex. 4 Saraswati Rao, the CEO of Relaxo Mattresses Co., has just
obtained some raw data from a marketing survey that her
company recently conducted. The survey was taken to
determine the effectiveness of the new company slogan, "When
you've given up on the rest, Relaxo!" To determine the effect of
the slogan on the sales of Relaxo Mattresses, 20 people were
asked how many mattresses per year they bought before and
after the slogan was used in the advertising campaign. The
results were as follows:
Before  After    Before  After    Before  After    Before  After
4       3        2       1        5       6        8       10
4       6        6       9        2       7        1       3
1       5        6       7        6       8        4       3
3       7        5       8        8       4        5       7
5       5        3       6        3       5        2       2
a) Create both frequency and relative frequency distributions for
the "Before" responses, using as classes 1 to 2, 3 to 4, 5 to
6, 7 to 8, and 9 to 10.
b) Work part (a) for the "After" responses.
c) Give the most basic reason why it makes sense to use the
same classes for both the "Before" and "After" responses.
d) For each pair of "Before/After" responses, subtract the
"Before" response from the "After" response to get the
number that we will call "Change" (example: 3 - 4 = -1), and
create frequency and relative frequency distributions for
"Change" using classes -5 to -4, -3 to -2, -1 to 0, 1 to 2, 3 to 4 and
5 to 6.
e) Based on your analysis, state whether or not the new slogan
has helped sales, and give one or two reasons to support
your conclusion.
Answer to Ex. 4
a) BEFORE
Mattresses Bought    Frequency    Relative Frequency
1-2                  5            0.25
3-4                  6            0.30
5-6                  7            0.35
7-8                  2            0.10
9-10                 0            0.00
Total                20           1.00
b) AFTER
Mattresses Bought    Frequency    Relative Frequency
1-2                  2            0.10
3-4                  4            0.20
5-6                  6            0.30
7-8                  6            0.30
9-10                 2            0.10
Total                20           1.00
c) In order to be able to compare the two distributions.
d)
Change Class         Frequency    Relative Frequency
-5 to -4             1            0.05
-3 to -2             0            0.00
-1 to 0              5            0.25
1 to 2               8            0.40
3 to 4               5            0.25
5 to 6               1            0.05
Total                20           1.00
e) Sales appear to have increased, but the apparent
increase could be due to other factors we do not know
about. We cannot say for sure that the new slogan has
helped.
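The distributions in the answer to Ex. 4 can be reproduced from the raw data with a few lines of Python. This is only a sketch: the pairs are read across each row of the data table, the classes are treated as inclusive at both limits, and the function name `distribution` is an illustrative choice.

```python
# (before, after) pairs as listed in Ex. 4, read across each row.
pairs = [
    (4, 3), (2, 1), (5, 6), (8, 10),
    (4, 6), (6, 9), (2, 7), (1, 3),
    (1, 5), (6, 7), (6, 8), (4, 3),
    (3, 7), (5, 8), (8, 4), (5, 7),
    (5, 5), (3, 6), (3, 5), (2, 2),
]


def distribution(values, classes):
    """Return {class: (frequency, relative frequency)} for inclusive classes."""
    n = len(values)
    result = {}
    for lower, upper in classes:
        frequency = sum(lower <= v <= upper for v in values)
        result[(lower, upper)] = (frequency, frequency / n)
    return result


count_classes = [(1, 2), (3, 4), (5, 6), (7, 8), (9, 10)]
change_classes = [(-5, -4), (-3, -2), (-1, 0), (1, 2), (3, 4), (5, 6)]

before = distribution([b for b, _ in pairs], count_classes)
after = distribution([a for _, a in pairs], count_classes)
change = distribution([a - b for b, a in pairs], change_classes)

for name, table in (("Before", before), ("After", after), ("Change", change)):
    print(name)
    for (lower, upper), (frequency, relative) in table.items():
        print(f"  {lower} to {upper}: {frequency}  {relative:.2f}")
```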
Ex. 5 The M.D. of Kingfisher is trying to estimate when the
Director General of Civil Aviation (DGCA) is likely to decide on
its application for a new route between Dehradun and Mumbai.
The route planning department of the airline has put together
the typical waiting times for application processing over the last
year. The data are provided in terms of the number of days from
filing of the application to the DGCA ruling.
34 40 23 28 31 40 25 33 47 32
44 34 38 31 33 42 26 35 27 31
29 40 31 30 34 31 38 35 37 33
24 44 37 39 32 36 34 36 41 39
29 22 28 44 51 31 44 28 47 31
a) Construct a frequency distribution using 10 closed
intervals, equally spaced. Which interval occurs most often?
b) Construct a frequency distribution using five closed
intervals, equally spaced. Which interval occurs most often?
c) If the M.D. of Kingfisher had a relative frequency
distribution for either (a) or (b), would that help him
estimate the answer he needs?
Answer to Ex. 5
a) WAITING TIME              b) WAITING TIME
Days       Frequency         Days       Frequency
22-24      3                 22-27      6
25-27      3                 28-33      18
28-30      6                 34-39      14
31-33      12                40-45      9
34-36      8                 46-51      3
37-39      6                 Total      50
40-42      5
43-45      4
46-48      2
49-51      1
Total      50
c) Yes, he would want to know the relative proportions at each
level.
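The two frequency distributions of Ex. 5 can be built programmatically. The following Python sketch assumes integer data whose span (highest value - lowest value + 1) divides evenly by the requested number of classes, which holds here for both 10 and 5 classes; the function names are illustrative.

```python
waiting_days = [
    34, 40, 23, 28, 31, 40, 25, 33, 47, 32,
    44, 34, 38, 31, 33, 42, 26, 35, 27, 31,
    29, 40, 31, 30, 34, 31, 38, 35, 37, 33,
    24, 44, 37, 39, 32, 36, 34, 36, 41, 39,
    29, 22, 28, 44, 51, 31, 44, 28, 47, 31,
]


def closed_intervals(values, n_classes):
    """Equal-width closed integer intervals covering min..max.
    Assumes (max - min + 1) divides evenly by n_classes."""
    low, high = min(values), max(values)
    width, remainder = divmod(high - low + 1, n_classes)
    if remainder:
        raise ValueError("range does not split evenly into the requested classes")
    return [(low + i * width, low + (i + 1) * width - 1) for i in range(n_classes)]


def frequency_table(values, intervals):
    """Count the values falling in each closed interval."""
    return {(lo, hi): sum(lo <= v <= hi for v in values) for lo, hi in intervals}


for n_classes in (10, 5):
    table = frequency_table(waiting_days, closed_intervals(waiting_days, n_classes))
    modal = max(table, key=table.get)
    print(n_classes, modal, table[modal], table[modal] / len(waiting_days))
```

With 10 classes the interval occurring most often is 31-33 (12 of the 50 applications, a relative frequency of 0.24); with five classes it is 28-33 (18 of 50, or 0.36). Relative frequencies of this kind are what part (c) refers to.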
SUMMARY
In this chapter, the definition of data processing and its elements
have been described. The definition of data editing and the steps
involved in editing have been explained. Methods of
approximation have been discussed. The types/stages of editing
and the precautions to be taken in editing have been included. The
concepts of coding and classification have been described.
The bases of classification have also been highlighted. Tabulation, its
meaning, objectives, constituents and types have been dealt
with in detail.
REVIEW QUESTIONS
1. Processing of data implies editing, coding, classification
and tabulation. Describe in brief these four operations
pointing out the significance of each in context of
research study.
2. Classification according to class intervals involves three
main problems viz., how many classes should be there?
How to choose class limits? How to determine class
frequency? State how these problems should be tackled
by a researcher.
3. Distinguish between
a) Field editing and central editing
b) Statistics of attributes and Statistics of variables
c) Exclusive type and inclusive type of class interval
d) Simple and complex tabulation
4. State the various methods of approximation and their
utility in statistics.
5. What is classification? What are the characteristics of an
ideal classification? Discuss its advantages and
limitations.
6. Describe different basis of classifications and explain each
of them with suitable illustrations.
7. Define tabulation. What are the methods of constructing a
table?
8. What are the different types of statistical tables? Explain
with examples.
CHAPTER 10 STATISTICAL ANALYSIS AND
INTERPRETATION OF DATA: NONPARAMETRIC TESTS
Objectives
After reading this chapter, the learner will be able to:
• Understand the concept and advantages of nonparametric tests
• Appreciate the limitations of nonparametric tests and methods
Structure
• Definition of Nonparametric Tests
• Advantages of Nonparametric Tests
• Limitations of Nonparametric Tests
• Listing of Nonparametric Methods
• The Sign Test
• A Rank Sum Test
• Kruskal-Wallis or H Test
• One Sample Run Test
• Kolmogorov-Smirnov Test
• Spearman's Rank Correlation
• Summary
• Review Questions
• Further Readings
DEFINITION OF NONPARAMETRIC TESTS
Statistical techniques that do not make restrictive assumptions
about the shape of a population distribution when performing a
test of a hypothesis are known as nonparametric tests. Since
these tests do not depend on the shape of the distribution, they
are also known as distribution-free tests.
These tests do not depend upon population parameters, such
as the mean and variance.
ADVANTAGES OF NON PARAMETRIC TESTS
Advantages of these tests are as follows:
i) Nonparametric tests are distribution free, i.e.,
they do not require any assumption that the
population follows a normal or any other
distribution.
ii) Generally they are simple to understand and easy
to apply when the sample sizes are small.
iii) Most nonparametric tests do not require lengthy
and laborious computations and hence are less
time-consuming. If significant results are obtained,
no further work is necessary.
iv) These tests are applicable to all types of data:
qualitative (nominal scaled) data, data in rank form
(ordinal scaled), as well as data that have been
measured more precisely (interval or ratio scaled).
v) Many nonparametric methods make it possible to work
with very small samples. This is particularly helpful to the
business researcher collecting pilot study data or to the
medical researcher working with a rare disease.
Activity 10.1
Would you apply a nonparametric test to a consumer behaviour
survey evaluating Cadbury Chocolates' new packaging? If so,
briefly state the reasons.
LIMITATION OF NON PARAMETRIC TESTS
The limitations of nonparametric tests are as follows:
1. These techniques are less efficient or powerful than the
corresponding standard (parametric) techniques.
2. Assertions made with equal confidence require larger
samples if they are made without knowledge of the form
of the underlying distribution than if they are made with
such knowledge.
3. As the sample size gets larger, the data manipulations
required for nonparametric procedures are sometimes
laborious unless appropriate computer software is
available.
4. They ignore a certain amount of information.
LISTING OF NON PARAMETRIC METHODS
The following are some of the important nonparametric
methods commonly used in business research
investigations:
i. Sign Test for Paired Data, where positive or negative
signs are substituted for quantitative values.
ii. A Rank Sum Test, which is used to determine whether
two independent samples have been drawn from the
same population.
iii. One Sample Run Test: a method used for determining
the randomness with which sampled items have been
selected.
iv. The Kruskal-Wallis or H Test, which generalizes the
analysis of variance to enable us to dispense with the
assumption that the populations are normally distributed.
v. Rank Correlation: used for correlation analysis when
exact measurements are not available but the items
can be ranked in relation to each other.
THE SIGN TEST
The sign test is the simplest of the nonparametric tests. The
name is derived from the fact that it is based on the signs (i.e.,
pluses or minuses) of a pair of observations and not on their
numerical magnitude. In any problem in which the sign test is
used we count the number of +ve signs, the number of -ve signs
and the number of 0s (i.e., observations which cannot be counted
as either positive or negative).
The sign test is of the following two types:
i) The One Sample Sign Test: In a one sample sign test we
test the null hypothesis $\mu = \mu_0$ against an appropriate
alternative on the basis of a random sample of size n. We
replace each sample value greater than $\mu_0$ with a plus
sign and each sample value less than $\mu_0$ with a minus
sign, and discard sample values exactly equal to $\mu_0$ (put
0). We then test the null hypothesis that these plus and
minus signs are values of a random variable having the
binomial distribution with $p = 1/2$.
We take $H_0: p = 0.5$ (null hypothesis). If the difference is due
to chance effects, the probability of a +ve sign for any
particular pair is 1/2, as is the probability of a -ve sign. If
S is the number of times the less frequent sign occurs, then S
has the binomial distribution with $p = 1/2$. The critical
value for a two sided alternative at $\alpha = 0.05$ (level of
significance at 5%) can be calculated by:
$$K = \frac{n-1}{2} - 0.98\sqrt{n}$$
$H_0$ is rejected if $S \le K$ for the sign test.
ii) The Paired Sample Sign Test: The sign test has a very
important application in problems involving paired data,
such as data relating to the collection of accounts
receivable before and after a new collection policy,
responses of father and son towards ideal family size, etc.
In these problems, each pair of sample values can be
replaced with a plus sign if the first value is greater than
the second and a minus sign if the first value is smaller
than the second, or be discarded if the two values are
equal.
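The decision rule for the sign test can be sketched in a few lines. The block below assumes the critical value formula K = (n - 1)/2 - 0.98*sqrt(n) used in the solved examples of this chapter, and adds an exact binomial tail probability as a cross-check; the function names and the sample figures (n = 14, S = 5) are illustrative assumptions.

```python
import math


def sign_test_critical_value(n):
    """Two-sided 5% critical value K = (n - 1)/2 - 0.98 * sqrt(n)."""
    return (n - 1) / 2 - 0.98 * math.sqrt(n)


def binomial_lower_tail(s, n):
    """Exact P(X <= s) for X ~ Binomial(n, 1/2)."""
    return sum(math.comb(n, k) for k in range(s + 1)) / 2 ** n


# Hypothetical figures: 14 non-zero signs, less frequent sign occurs 5 times.
n, s = 14, 5
k = sign_test_critical_value(n)
reject_h0 = s <= k  # H0 is rejected only if S <= K

print(f"K = {k:.2f}, S = {s}, reject H0: {reject_h0}")
print(f"Exact P(X <= {s}) under H0 = {binomial_lower_tail(s, n):.4f}")
```

Zeros (ties) are dropped before n is counted, which is why n here is the number of plus and minus signs only.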
APPLICATION
The paired sign test is most often employed for observations that
have been randomly selected in pairs, when the design of the
experiment is of the paired difference type.
ADVANTAGES OF SIGN TESTS:
i) It is simple to use and understand.
ii) It is easy to calculate and less time-consuming.
iii) It is the most powerful tool when no value of the data is
available except greater-than or less-than values.
LIMITATIONS OF SIGN TESTS
This test is generally used only when values of n are less than
30.
A RANK SUM TEST
The Wilcoxon-Mann-Whitney U test is a popular member of the
family of rank sum tests. Here we shall discuss only this one
type. With this test we can test the null hypothesis without
assuming that the populations sampled have roughly the shape
of a normal distribution.
This test helps us to determine whether two samples have come
from identical populations. If it is true that the samples have
come from identical populations, it is reasonable to assume that
the means of the ranks assigned to the values of the two samples
are more or less the same. The alternative hypothesis is that
the means of the populations are not equal; if this is the case,
most of the smaller ranks will go to the values of one sample,
while most of the higher ranks will go to those of the other
sample.
The test of the null hypothesis that the two samples come from
identical populations may be based either on $R_1$, the sum of
the ranks of the values of the first sample, or on $R_2$, the sum
of the ranks of the values of the second sample. It may be noted
that in practice it does not matter which sample we call sample 1
and which we call sample 2.
If the sample sizes are $n_1$ and $n_2$, the sum of $R_1$ and $R_2$ is simply
the sum of the first $n_1 + n_2$ positive integers, which is known to be
$$R_1 + R_2 = \frac{(n_1 + n_2)(n_1 + n_2 + 1)}{2}$$
This formula enables us to find $R_2$ if we know $R_1$, and vice versa.
When the use of the rank sums was first proposed as a nonparametric
alternative to the two-sample t-test, the decision
was based on $R_1$ or $R_2$, but now the decision is usually based on
either of the related statistics:
$$U_1 = n_1 n_2 + \frac{n_1(n_1 + 1)}{2} - R_1$$
$$U_2 = n_1 n_2 + \frac{n_2(n_2 + 1)}{2} - R_2$$
where $n_1$ and $n_2$ are the sizes of the samples and $R_1$ and $R_2$ are
the rank sums of the corresponding samples. For small
samples, if both $n_1$ and $n_2$ are less than 10 (some statisticians
say 8), special tables must be used, and if U is smaller than the
critical value, the null hypothesis is rejected, where U is the
smaller of $U_1$ and $U_2$. For a large sample, U follows
approximately a normal distribution with
$$\mu_U = \frac{n_1 n_2}{2}, \qquad \sigma_U = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}, \qquad Z = \frac{U - \mu_U}{\sigma_U}$$
In using this statistic, it is unimportant whether the larger or
smaller value obtained from the formulae is used. The values for
Z will be numerically equal, but opposite in sign. Note that tied
observations are again given the mean of the common ranks.
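The rank sum calculation can be illustrated with a small script. This is a sketch only: it assumes the common textbook form U1 = n1*n2 + n1(n1 + 1)/2 - R1 (and similarly for U2) together with the usual large-sample mean and standard deviation of U; the helper names and the two tiny samples are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from lowest to highest; ties get the mean of their ranks."""
    first, last = {}, {}
    for position, v in enumerate(sorted(values), start=1):
        first.setdefault(v, position)  # first rank position held by v
        last[v] = position             # last rank position held by v
    return [(first[v] + last[v]) / 2 for v in values]


def mann_whitney(sample1, sample2):
    """Return rank sums, both U statistics and their Z values."""
    n1, n2 = len(sample1), len(sample2)
    ranks = average_ranks(list(sample1) + list(sample2))  # rank as one sample
    r1, r2 = sum(ranks[:n1]), sum(ranks[n1:])
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return {
        "R1": r1, "R2": r2, "U1": u1, "U2": u2,
        "Z1": (u1 - mean_u) / sd_u, "Z2": (u2 - mean_u) / sd_u,
    }


# Hypothetical samples; real use would need larger samples for the Z values.
result = mann_whitney([1, 2, 3], [4, 5, 6])
print(result)
```

Whichever of U1 or U2 is used, the two Z values differ only in sign, which is the point made in the paragraph above.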
KRUSKAL WALLIS OR H TEST
The Kruskal-Wallis or H test is another test belonging to the
family of rank sum tests. This test helps in testing the null
hypothesis that k independent random samples come from
identical populations against the alternative hypothesis that
the means of these populations are not all equal.
As is done in the U-test, all data are ranked as if they were in
one sample, from lowest to highest, and the rank sum of each
sample is calculated. The H-statistic is calculated from the
formula:
$$H = \frac{12}{n(n+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(n+1)$$
where $n_1, n_2, \ldots, n_k$ are the numbers of observations in each
of the k samples, $n = n_1 + n_2 + \ldots + n_k$, and $R_1, R_2, \ldots, R_k$
are the rank sums of each sample.
If the null hypothesis is true and each sample has at least five
observations, the sampling distribution of H can be
approximated closely with a chi-square distribution with (k-1)
degrees of freedom. Consequently, we can reject the null
hypothesis that $\mu_1 = \mu_2 = \ldots = \mu_k$ and accept the alternative
that the $\mu$s are not all equal at the level of significance $\alpha$
if $H > \chi^2_\alpha$ for k-1 degrees of freedom. If any sample has
fewer than five items, the $\chi^2$ approximation cannot be used, and
the test must be based on special tables.
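The H statistic is straightforward to compute once the pooled ranks are available. The sketch below assumes the standard form H = 12/(n(n+1)) * sum(R_i^2/n_i) - 3(n+1), with tied values given the mean of their ranks; the function names and the three small samples are illustrative only and are too small for the chi-square approximation.

```python
def average_ranks(values):
    """Rank values from lowest to highest; ties get the mean of their ranks."""
    first, last = {}, {}
    for position, v in enumerate(sorted(values), start=1):
        first.setdefault(v, position)
        last[v] = position
    return [(first[v] + last[v]) / 2 for v in values]


def kruskal_wallis_h(samples):
    """Compute H from k samples ranked together as one pooled sample."""
    pooled = [v for sample in samples for v in sample]
    ranks = average_ranks(pooled)
    n = len(pooled)
    weighted_sum = 0.0
    start = 0
    for sample in samples:
        rank_sum = sum(ranks[start:start + len(sample)])  # R_i
        weighted_sum += rank_sum ** 2 / len(sample)       # R_i^2 / n_i
        start += len(sample)
    return 12 / (n * (n + 1)) * weighted_sum - 3 * (n + 1)


# Hypothetical data: three completely separated samples.
h = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(f"H = {h:.2f}")
```

In practice H would then be compared with the tabulated chi-square value for k - 1 degrees of freedom, provided every sample has at least five observations.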
ONE SAMPLE RUN TEST
This test was evolved in order to make a judgment about the
randomness of a sample on the basis of the order in which the
experimental observations are made. In many applications it
is difficult to judge whether the sample used is random or not,
especially when the researcher has no control over the selection
of data. Therefore the concept of a run has been developed by
statisticians to help the researcher make such a judgment.
A run is a succession of identical letters (or other kinds of
symbols) which is followed or preceded by a different letter or
no letter at all. To illustrate, consider the following arrangement
of occurrences of X or Y chromosomes:
XXX YYYY XXXX YYYYYY XX YYY
Grouping the letters which constitute the runs, we find that
first there is a run of three Xs, then a run of four Ys, then a
run of four Xs, then a run of six Ys, then a run of two Xs and
lastly a run of three Ys.
It may be pointed out that the total number of runs appearing in
an arrangement is often a good indication of a possible lack of
randomness. If there are too few runs, a definite grouping,
clustering or trend may be suspected. On the other hand, if
there are too many runs, some sort of repeated alternating
pattern may be suspected. Thus, too many or too few runs in a
sample may indicate that something other than chance was at
work when the items were selected.
The number of runs, r, is a statistic with its own special
sampling distribution and its own test. If $n_1$ and $n_2$ are the
numbers of occurrences of the two kinds of symbols, the mean of
the sampling distribution of the r statistic is given by:
$$\mu_r = \frac{2 n_1 n_2}{n_1 + n_2} + 1$$
where $\mu_r$ = mean of the r statistic.
The standard error of the r statistic is calculated by the formula:
$$\sigma_r = \sqrt{\frac{2 n_1 n_2 (2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)}}$$
It may be noted that the sampling distribution of r can be
closely approximated by the normal distribution if either $n_1$ or
$n_2$ is larger than 20.
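Counting runs and evaluating the mean and standard error of r can be sketched as follows. The block assumes the usual formulas for a two-symbol sequence (mean 2*n1*n2/(n1 + n2) + 1 and the corresponding standard error) and applies them to the X/Y arrangement shown above; the function name `runs_test` is an illustrative choice.

```python
import math
from itertools import groupby


def runs_test(sequence):
    """Return the number of runs r, its mean, standard error and Z value."""
    symbols = sorted(set(sequence))
    if len(symbols) != 2:
        raise ValueError("the sequence must contain exactly two kinds of symbols")
    r = sum(1 for _ in groupby(sequence))  # each group of equal neighbours is one run
    n1 = sequence.count(symbols[0])
    n2 = sequence.count(symbols[1])
    mean_r = 2 * n1 * n2 / (n1 + n2) + 1
    sd_r = math.sqrt(
        2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
        / ((n1 + n2) ** 2 * (n1 + n2 - 1))
    )
    return {"r": r, "n1": n1, "n2": n2, "mean": mean_r, "sd": sd_r,
            "z": (r - mean_r) / sd_r}


# The arrangement from the text, written without the spaces between runs.
outcome = runs_test("XXXYYYYXXXXYYYYYYXXYYY")
print(outcome)
```

Here neither n1 nor n2 exceeds 20, so the Z value is shown only to illustrate the arithmetic; for samples this small the text's caveat about the normal approximation applies.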
KOLMOGOROV SMIRNOV TEST
The Kolmogorov-Smirnov test is a simple nonparametric
method for testing whether there is a significant difference
between an observed frequency distribution and a theoretical
frequency distribution. The K-S test is, therefore, another
measure of the goodness of fit of a theoretical frequency
distribution.
ADVANTAGES
i. It is a powerful nonparametric test.
ii. It is easier to use since it does not require that the data be
grouped in any way.
iii. The K-S statistic $D_n$ is particularly useful for judging how
close the observed frequency distribution ($F_o$) is to the
expected frequency distribution ($F_e$), because the probability
distribution of $D_n$ depends only on the sample size n and not
on the expected frequency distribution ($D_n$ is a
distribution-free statistic).
Calculating the K-S statistic
$$D_n = \max |F_e - F_o|$$
$D_n$ can be calculated simply by picking out the maximum
absolute deviation of $F_e$ from $F_o$, where $F_e$ and $F_o$ are the
expected and observed relative cumulative frequencies.
SPEARMANS RANK CORRELATION
This method of finding out covariability, or the lack of it, between
two variables was developed by the British psychologist C.E.
Spearman in 1904. This measure is especially useful when a
quantitative measure for certain factors (such as in the evaluation
of leadership ability or the judgment of female beauty) cannot be
fixed, but the individuals in the group can be arranged in order,
thereby obtaining for each individual a number indicating his/her
rank in the group.
Spearman's rank correlation is defined as
$$R_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$$
where $R_s$ = Rank coefficient of correlation
d = Difference of rank between paired items in the two series.
n = No. of observations.
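The definition translates directly into code. The following sketch assumes both series are already expressed as ranks without ties; the function name and the two short rank lists are hypothetical.

```python
def spearman_from_ranks(ranks_x, ranks_y):
    """Rs = 1 - 6 * sum(d^2) / (n * (n^2 - 1)) for two series of ranks."""
    if len(ranks_x) != len(ranks_y):
        raise ValueError("both series must have the same number of items")
    n = len(ranks_x)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(ranks_x, ranks_y))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))


# Hypothetical ranks: identical orderings and exactly reversed orderings.
same_order = spearman_from_ranks([1, 2, 3, 4, 5], [1, 2, 3, 4, 5])
reverse_order = spearman_from_ranks([1, 2, 3, 4, 5], [5, 4, 3, 2, 1])
print(same_order, reverse_order)
```

The two calls show the extremes of the coefficient: perfect agreement between the rankings and perfect inverse agreement.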
The standard error of the rank correlation coefficient (for large n, under the hypothesis of no correlation) is given by
$$\sigma_{R_s} = \frac{1}{\sqrt{n - 1}}$$
ADVANTAGES
i) It is simple and easy to understand.
ii) It is a powerful tool for determining covariability
among attributes such as honesty, beauty etc.
iii) It is easy to calculate.
LIMITATIONS
i) It is not as accurate as Karl Pearson's coefficient
of correlation r.
ii) It should not be used where N is more than 30,
unless the original data are ranks instead of scores.
SOLVED EXAMPLES
Ex. 1 Indicate True or False
1. One advantage of nonparametric methods is that some
of the tests do not require us even to rank the
observations.
2. The Mann-Whitney U test is one of a family of tests
known as rank-difference tests.
3. A sign test for paired data is based upon the binomial
distribution but can often be approximated by the normal
distribution.
4. One disadvantage of nonparametric methods is that they
tend to ignore a certain amount of information.
5. In the Mann-Whitney U test, two samples of size n1 and
n2 are taken to determine the U statistic. The sampling
distribution of the U statistic can be approximated by the
normal distribution when either n1 or n2 is greater than
10.
6. The Mann-Whitney U test tends to waste less data than
the sign test.
7. Assume that in a rank test, two elements are tied for the
tenth rank position. We assign each of them a rank of
10.5 and the next element after these two receives a rank
of 11.
8. In contrast to regression analysis, where one may
compute a coefficient of correlation, an equivalent
measure may be determined for a ranking of two variables
in nonparametric testing. This equivalent measure is
called a rank-correlation coefficient.
9. In a one-sample runs test, the number of runs is a
statistic which has its own sampling distribution.
10. One disadvantage in using the rank-correlation coefficient
is that it is very sensitive to extreme observations in the
data set.
11. The Kolmogorov-Smirnov test can be used to measure
the goodness of fit of a theoretical distribution.
12. Nonparametric methods are more efficient than
parametric methods.
13. The one-sample runs test enables us to determine
whether two independent samples have been drawn from
populations with the same distribution.
14. The sequence A, A, B, A, B contains four runs.
15. A rank correlation coefficient of -1 represents a perfect
inverse rank correlation.
16. In a one-sample runs test, the alternative hypothesis is that
the sequence of observations is not random.
17. In the Mann-Whitney U test, it is not necessary that the
two samples be of the same size.
18. The K-S test statistic is simply the minimum absolute
deviation between the observed relative cumulative
frequencies and the expected relative cumulative
frequencies.
19. ... population means are equal, provided the populations are
normally distributed with equal variances.
20. The Kruskal-Wallis test is a nonparametric version of
ANOVA.
21. The sampling distribution of the Kruskal-Wallis K statistic
can be approximated by a chi-square distribution only if
all sample sizes are at least 5.
Answer Key to Example 1:
1T 2F 3T 4T 5F
6T 7F 8T 9T 10F
11T 12F 13F 14T 15T
16T 17T 18F 19F 20T
21T
Ex. 2 Fill in the blanks
i. A sequence of identical occurrences preceded and
followed by different occurrences or none at all is a
_______________.
ii. A nonparametric method used to determine whether two
independent samples have been drawn from populations
with the same distribution is the __________.
iii. A nonparametric technique for determining the
randomness with which sampled items have been
selected is the _______________.
iv. A _________ test tests for the difference between paired
observations by substituting +, -, and 0 for quantitative
values.
v. A _____________ coefficient measures the degree of
association between two variables and is based on the
ranks of the observations.
vi. The U statistic has a special property that enables us to
save computational time when _______________.
vii. To distinguish it from the coefficient of correlation, the
rank coefficient of correlation is
denoted __________________.
viii. The K-S statistic Dn is a _____________ statistic in that
it is independent of the expected frequency distribution.
ix. The ______________ test has advantages over the chi-square
test for goodness of fit because the data need not
be grouped in any way.
Answer Key to Example 2
i) Run
ii) Mann-Whitney U Test
iii) One-sample runs test
iv) Sign
v) Rank Correlation
vi) The sample sizes are unequal.
vii) rs
viii) Distribution free
ix) Kolmogorov-Smirnov.
Ex. 3 The null hypothesis most often examined in nonparametric
tests:
a) Includes specification of a population parameter.
b) Is used to evaluate some general population aspect.
c) Is very similar to that used in regression analysis.
d) Simultaneously tests more than two population
parameters.
Answer to Example 3: (b)
Answer 4 : They do not use all the information in data, since
they usually rely on ranks or counts.
Ex. 5 A typing school claims that in a six-week intensive
course, it can train students to type, on the average, at
least 60 words per minute. A random sample of 15
graduates is given a typing test and the median number
of words per minute typed by each of these students is
given below. Test the hypothesis that the median typing
speed of graduates is at least 60 words per minute.
Student   Words per minute (WPM)   WPM - 60
A         81                       +21
B         76                       +16
C         53                       -7
D         71                       +11
E         66                       +6
F         59                       -1
G         88                       +28
H         73                       +13
I         80                       +20
J         66                       +6
K         58                       -2
L         70                       +10
M         60                       0
N         56                       -4
O         55                       -5
S = 5
Answer to Example 5
Following the test procedure, we have:
1. $H_0$: Md = 60
$H_1$: Md > 60
2. Level of significance: 5 per cent
Criteria: Since the alternative hypothesis is one-sided, we obtain
the probability of our result from the above table and compare it
with 0.05. If the probability is less than 0.05, we reject $H_0$.
No. of + signs = 9
No. of - signs = 5
No. of 0s = 1
Total = 15
Dropping the zero leaves n = 14, so
$$K = \frac{n-1}{2} - 0.98\sqrt{n} = \frac{14-1}{2} - 0.98\sqrt{14} = 6.5 - 3.67 = 2.83$$
Since S > K (5 > 2.83), the null hypothesis is accepted.
Ex. 6 Because of the severity of recent winters, there has been
talk that the earth is slowly progressing toward another ice age.
Some scientists hold different views, however, because the
summers have brought extreme temperatures as well. One
scientist suggested looking at the mean temperature for each
month to see if it was lower than in the previous year. Another
meteorologist at the government weather service argued that
perhaps they should look as well at temperatures in the spring
and autumn months of the last 2 years, so that their conclusions
would be based on other than extreme temperatures. In this
way, he said they could detect whether there appeared to be a
general warming or cooling trend or just extreme temperatures
in the summer and winter months. So 15 dates in the spring
and autumn were randomly selected, and the temperatures in
the last 2 years were noted for a particular location with
generally moderate temperatures. Following are the dates and
corresponding temperatures for 1991 and 1992.
a) Is the meteorologist's reasoning as to the method of
evaluation sound? Explain.
b) Using a sign test, determine whether the meteorologist
can conclude at $\alpha = 0.05$ that 1992 was cooler than 1991,
based on these data.
Temperature (Fahrenheit)
Date 1991 1992 Date 1991 1992
Mar. 29 58 57 Oct. 12 54 48
Apr. 4 45 70 May. 31 74 79
Apr. 13 56 46 Sept. 28 69 60
May 22 75 67 June. 5 80 74
Oct. 1 52 60 June. 17 82 79
Mar. 23 49 47 Oct. 5 59 72
Nov. 12 48 45 Nov. 28 50 50
Sept.30 67 71
Answer to Example 6
a) No, even if 1992 is significantly cooler than 1991, that
alone is not strong evidence of a long-run trend toward
cooler weather.
b) Temperature (Fahrenheit)
Date       1991 (i)   1992 (ii)   Sign of (i) - (ii)
Mar. 29    58         57          +
Apr. 4     45         70          -
Apr. 13    56         46          +
May 22     75         67          +
Oct. 1     52         60          -
Mar. 23    49         47          +
Nov. 12    48         45          +
Sept. 30   67         71          -
Oct. 12    54         48          +
May 31     74         79          -
Sept. 28   69         60          +
June 5     80         74          +
June 17    82         79          +
Oct. 5     59         72          -
Nov. 28    50         50          0
Here + signs = 9, - signs = 5, and 0 = 1.
After dropping the zero, the sample size is n = 14.
S = 5 (the number of times the less frequent sign occurs)
$$K = \frac{n-1}{2} - 0.98\sqrt{n} = \frac{14-1}{2} - 0.98\sqrt{14} = 6.5 - 3.67 = 2.83$$
Since S > K, we accept the null hypothesis: 1992 was not
significantly cooler than 1991.
Ex. 7 Use the sign test to see if there is a difference between
the number of days until the collection of an account
receivable before and after a new collection policy. Use
the 0.05 significance level.
Before: 30 28 34 35 40 42 33 38 34 45 28 27 25 41 36
After: 32 29 33 32 37 43 40 41 37 44 27 33 30 38 36
Answer to Example 7
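Before (1st)   After (2nd)   Sign of (1st - 2nd)
30             32            -
28             29            -
34             33            +
35             32            +
40             37            +
42             43            -
33             40            -
38             41            -
34             37            -
45             44            +
28             27            +
27             33            -
25             30            -
41             38            +
36             36            0
Here + signs = 6, - signs = 8, and 0 = 1, so n = 14 after
dropping the zero.
S = 6 (the number of times the less frequent sign occurs)
$$K = \frac{n-1}{2} - 0.98\sqrt{n} = \frac{14-1}{2} - 0.98\sqrt{14} = 6.5 - 3.67 = 2.83$$
Since S > K the null hypothesis is accepted. There is no
significant difference before and after the new collection policy
in the accounts receivable.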
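The paired sign test for these before/after figures can be checked with a short script. This is a sketch under the chapter's own rules: S is taken as the count of the less frequent sign, ties are dropped, and K = (n - 1)/2 - 0.98*sqrt(n); the function name `paired_sign_test` is an illustrative assumption.

```python
import math

# Days until collection, copied from Example 7.
before = [30, 28, 34, 35, 40, 42, 33, 38, 34, 45, 28, 27, 25, 41, 36]
after = [32, 29, 33, 32, 37, 43, 40, 41, 37, 44, 27, 33, 30, 38, 36]


def paired_sign_test(first, second):
    """Count signs of (first - second) and apply the K rule from the text."""
    plus = sum(1 for a, b in zip(first, second) if a > b)
    minus = sum(1 for a, b in zip(first, second) if a < b)
    zeros = sum(1 for a, b in zip(first, second) if a == b)
    n = plus + minus              # ties are discarded
    s = min(plus, minus)          # less frequent sign
    k = (n - 1) / 2 - 0.98 * math.sqrt(n)
    return {"plus": plus, "minus": minus, "zeros": zeros,
            "n": n, "S": s, "K": k, "reject_h0": s <= k}


summary = paired_sign_test(before, after)
print(summary)
```

Because S exceeds K, the sketch reaches the same decision as the worked answer: the null hypothesis of no difference is not rejected.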
Ex. 8 The following data relate to the daily production of
cement (in m. tonnes) of Vikram Cement Ltd., Jawad,
Mandsaur District, a large plant, for 30 days:
11.5 10.0 11.2 10.0 12.3 11.1 10.2 9.6 8.7 9.3
9.3 10.7 11.3 10.4 11.4 12.3 11.4 10.2 11.6 9.5
10.8 11.9 12.4 9.6 10.5 11.6 8.3 9.3 10.4 11.5
Use the sign test to test the null hypothesis that the plant's average
daily production of cement is $H_0: \mu = 11.2$ m. tonnes against the
alternative hypothesis $H_1: \mu < 11.2$ m. tonnes at the 0.05 level of
significance.
Answer to Ex. 8
Putting + and - signs (and 0 for a value equal to 11.2), we get
+ - 0 - + - - - - -
- - + - + + + - + -
- + + - - + - - - +
Number of + signs = 11
Number of - signs = 18
Number of zeros = 1
Sample size = 30
For large samples, generally considered n > 25 for the sign test,
the normal approximation to the binomial may be used,
correcting for continuity. Since p = 0.50 for this test, the
mean equals $np = 0.5n$ and the standard deviation equals
$\sqrt{npq} = 0.5\sqrt{n}$. The value of Z can be computed using the formula
$$Z = \frac{x - 0.5n}{0.5\sqrt{n}}$$
where x is the number of plus signs (with the continuity
correction, x is replaced by x + 0.5 when x lies below the mean).
The value obtained can then be compared to the critical value of Z
which is appropriate for the direction of the test. As mentioned
before, in the event of ties, all sign changes of 0 are dropped
before evaluating the results.
x = 11, n = 30, $p_0 = 0.5$
Substituting the values in the formula
$$Z = \frac{11 - 0.5(30)}{0.5\sqrt{30}} = \frac{11 - 15}{2.74} = \frac{-4}{2.74} = -1.46$$
Since |Z| = 1.46 is less than $Z_{0.05} = 1.645$, the null hypothesis
cannot be rejected at the 0.05 level; the data do not show that
average daily production is less than 11.2 m. tonnes.
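The large-sample version of the sign test for these production figures can be sketched as below. The block counts the signs against 11.2 and evaluates Z = (x - 0.5n)/(0.5*sqrt(n)); the optional continuity correction and the variant that drops the tie before counting n are shown as assumptions for comparison, and the function name `sign_test_z` is illustrative.

```python
import math

# Daily cement production (m. tonnes) for 30 days, copied from Example 8.
production = [
    11.5, 10.0, 11.2, 10.0, 12.3, 11.1, 10.2, 9.6, 8.7, 9.3,
    9.3, 10.7, 11.3, 10.4, 11.4, 12.3, 11.4, 10.2, 11.6, 9.5,
    10.8, 11.9, 12.4, 9.6, 10.5, 11.6, 8.3, 9.3, 10.4, 11.5,
]
hypothesised = 11.2

plus = sum(1 for v in production if v > hypothesised)
minus = sum(1 for v in production if v < hypothesised)
zeros = sum(1 for v in production if v == hypothesised)


def sign_test_z(x, n, continuity=False):
    """Normal approximation to the binomial with p = 0.5."""
    if continuity:
        # Move x half a unit towards the mean (assumed form of the correction).
        x = x + 0.5 if x < 0.5 * n else x - 0.5
    return (x - 0.5 * n) / (0.5 * math.sqrt(n))


z_all_days = sign_test_z(plus, len(production))                  # n = 30
z_ties_dropped = sign_test_z(plus, plus + minus, continuity=True)  # n = 29

print(f"+ = {plus}, - = {minus}, 0 = {zeros}")
print(f"Z with n = 30: {z_all_days:.2f}")
print(f"Z with tie dropped and continuity correction: {z_ties_dropped:.2f}")
```

Both variants give a Z value whose magnitude is below 1.645, so the choice between them does not change the decision for a one-sided test at the 0.05 level.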
Ex. 9 Area Exchange Bhopal has been keeping track of the
number of senders that were in use at a given instant.
Observations were made on 3754 different occasions. For capital
investment planning purposes, the finance officer of this
organization thinks that the pattern of usage follows a Poisson
distribution with a mean of 8.5. Test the hypothesis at the 0.01
level of significance. Use the K-S test.
Answer to Example 9
Let $H_0$: A Poisson distribution with $\lambda = 8.5$ is a good
description of the pattern of usage (null hypothesis).
$\alpha = 0.01$
(Level of significance for testing the hypothesis)
Observed frequencies
No. Busy Observed frequency
0 0
1 5
2 14
3 24
4 57
5 111
6 197
7 278
8 378
9 418
10 461
11 433
12 413
13 358
14 219
15 145
16 109
17 57
18 43
19 16
20 7
21 8
22 3
Transform the observed frequency into observed Cumulative
frequency, observed Relative cumulative frequency as shown in
Table A below.
Table A for Example 9
Observed and Relative Cumulative Frequencies
Columns: Number Busy; Observed Frequency; Observed Cumulative Frequency; Observed Relative Cumulative Frequency
0 0 0 0.0000
1 5 5 0.0013
2 14 19 0.0051
3 24 43 0.0115
4 57 100 0.0266
5 111 211 0.0562
6 197 408 0.1087
7 278 686 0.1827
8 378 1,064 0.2834
9 418 1,482 0.3948
10 461 1,943 0.5176
11 433 2,376 0.6329
12 413 2,789 0.7429
13 358 3,147 0.8383
14 219 3,366 0.8966
15 145 3,511 0.9353
16 109 3,620 0.9643
17 57 3,677 0.9795
18 43 3,720 0.9909
19 16 3,736 0.9952
20 7 3,743 0.9971
21 8 3,751 0.9992
22 3 3,754 1.0000
Use the Poisson formula to compute the expected frequencies.
Calculate the absolute deviation for x = 0 to 22. The results are
tabulated in Table B.
To compute the K-S statistic, calculate $D_n$, the maximum absolute
deviation of $F_e$ from $F_o$:
$$D_n = \max |F_e - F_o|$$
Table B for Example 9
Columns: Number Busy; Observed Frequency; Observed Cumulative Frequency; Observed Relative Cumulative Frequency ($F_o$); Expected Relative Cumulative Frequency ($F_e$); Absolute Deviation $|F_e - F_o|$
0 0 0 0.0000 0.0002 0.0002
1 5 5 0.0013 0.0019 0.0006
2 14 19 0.0051 0.0093 0.0042
3 24 43 0.0115 0.0301 0.0186
4 57 100 0.0266 0.0744 0.0478
5 111 211 0.0562 0.1496 0.0934
6 197 408 0.1087 0.2562 0.1475
7 278 686 0.1827 0.3856 0.2029
8 378 1,064 0.2834 0.5231 0.2397
9 418 1,482 0.3948 0.6530 0.2582
10 461 1,943 0.5176 0.7634 0.2458
11 433 2,376 0.6329 0.8487 0.2158
12 413 2,789 0.7429 0.9091 0.1662
13 358 3,147 0.8383 0.9486 0.1103
14 219 3,366 0.8966 0.9726 0.0760
15 145 3,511 0.9353 0.9862 0.0509
16 109 3,620 0.9643 0.9934 0.0291
17 57 3,677 0.9795 0.9970 0.0175
18 43 3,720 0.9909 0.9987 0.0078
19 16 3,736 0.9952 0.9995 0.0043
20 7 3,743 0.9971 0.9998 0.0027
21 8 3,751 0.9992 0.9999 0.0007
22 3 3,754 1.0000 1.0000 0.0000
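In this problem $D_n = 0.2582$ at x = 9.
The critical value of $D_n$ at $\alpha = 0.01$ and n = 3754 is computed
by the formula
$$\frac{1.63}{\sqrt{n}} = \frac{1.63}{\sqrt{3754}} = 0.0266$$
Since the critical value 0.0266 is less than the calculated value
0.2582, we reject the null hypothesis $H_0$.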
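The whole of Example 9 can be reproduced numerically. The sketch below builds the observed relative cumulative frequencies, compares them with the cumulative Poisson probabilities for a mean of 8.5, and uses 1.63/sqrt(n) as the large-sample critical value at the 0.01 level; the variable names are illustrative.

```python
import math

# Observed frequencies for 0 to 22 senders busy, copied from Example 9.
observed = [0, 5, 14, 24, 57, 111, 197, 278, 378, 418, 461, 433,
            413, 358, 219, 145, 109, 57, 43, 16, 7, 8, 3]
mean_busy = 8.5
n = sum(observed)


def poisson_pmf(k, lam):
    """Poisson probability of exactly k occurrences."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


deviations = []
cumulative_observed = 0
cumulative_expected = 0.0
for k, frequency in enumerate(observed):
    cumulative_observed += frequency
    cumulative_expected += poisson_pmf(k, mean_busy)   # Fe
    relative_observed = cumulative_observed / n        # Fo
    deviations.append(abs(cumulative_expected - relative_observed))

d_n = max(deviations)
x_at_max = deviations.index(d_n)
critical_value = 1.63 / math.sqrt(n)  # large-sample value for alpha = 0.01

print(f"n = {n}, D_n = {d_n:.4f} at x = {x_at_max}")
print(f"critical value = {critical_value:.4f}, reject H0: {d_n > critical_value}")
```

The computed maximum deviation agrees with Table B to four decimal places, and it is far larger than the critical value, which is why the Poisson hypothesis is rejected.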
Ex. 10 PONDS INDIA LTD. has organized a beauty contest to
select an advertising model for a newly launched soap
product. Ten competitors are ranked by three judges.
Use Spearman's rank correlation coefficient to
determine which pair of judges has the nearest
approach to common tastes in beauty.
First Judge    1  6  5  10  3  2   4  9   7  8
Second Judge   3  5  8  4   7  10  2  1   6  9
Third Judge    6  4  9  8   1  2   3  10  5  7
Answer to Example 10
In order to find out which pair of judges has the nearest
approach to common tastes in beauty, we compare the rank
correlations between the judgments of:
i) 1st Judge and 2nd Judge
ii) 2nd Judge and 3rd Judge
iii) 3rd Judge and 1st Judge.
Rank by 1st   Rank by 2nd   Rank by 3rd   (R1-R2)^2   (R3-R2)^2   (R3-R1)^2
Judge (R1)    Judge (R2)    Judge (R3)
1             3             6             4           9           25
6             5             4             1           1           4
5             8             9             9           1           16
10            4             8             36          16          4
3             7             1             16          36          4
2             10            2             64          64          0
4             2             3             4           1           1
9             1             10            64          81          1
7             6             5             1           1           4
8             9             7             1           4           1
N=10          N=10          N=10          Sum=200     Sum=214     Sum=60
Rank correlation between the judgments of the 1st and 2nd Judge:
$$R_{s12} = 1 - \frac{6 \times 200}{10(10^2 - 1)} = 1 - 1.212 = -0.212$$
Rank correlation between the judgments of the 2nd and 3rd Judge:
$$R_{s23} = 1 - \frac{6 \times 214}{10(10^2 - 1)} = 1 - 1.297 = -0.297$$
Rank correlation between the judgments of the 3rd and 1st Judge:
$$R_{s31} = 1 - \frac{6 \times 60}{10(10^2 - 1)} = 1 - 0.3636 = 0.636$$
Since the coefficient of correlation is highest between the
judgments of the 1st and 3rd judges, we conclude that they have
the nearest approach to common tastes in beauty.
Ex. 11 Quotations of index numbers of security prices of
Arvind Mills Ltd., Ahmedabad, at the Bombay Stock Exchange
are given below:
Year   Debenture Price   Share Price
1989   97.8              73.2
1990   99.2              85.8
1991   98.8              78.9
1992   98.3              75.8
1993   98.4              77.2
1994   96.7              87.2
1995   97.1              83.8
Using the rank correlation method, determine the relationship
between debenture prices and share prices.
Answer to Example 11
First assign ranks and calculate the rank correlation coefficient
by the formula:
$$R_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$$
Year   Debenture   Rank   Share       Rank   (Rx-Ry)^2
       Price (x)   (Rx)   Price (y)   (Ry)
1989   97.8        3      73.2        1      4
1990   99.2        7      85.8        6      1
1991   98.8        6      78.9        4      4
1992   98.3        4      75.8        2      4
1993   98.4        5      77.2        3      4
1994   96.7        1      87.2        7      36
1995   97.1        2      83.8        5      9
N=7                                          Sum=62
$$R_s = 1 - \frac{6 \times 62}{7(7^2 - 1)} = 1 - \frac{372}{336} = 1 - 1.107 = -0.107$$
There is a weak negative correlation between debenture prices and
share prices.
Ex. 12 Two women customers are randomly selected in a
supermarket and are asked to test 7 different types of lipsticks
and rank them in order of preference (from 7 best to 1 least
desirable). The results are as follows. Calculate the rank
correlation coefficient.
Lipsticks   A  B  C  D  E  F  G
Neelu       2  1  4  3  5  7  6
Sheelu      1  3  2  4  5  6  7
Answer to Example 12
Neelu    Sheelu   D = (R1 - R2)   D^2
x (R1)   y (R2)
2        1        +1              1
1        3        -2              4
4        2        +2              4
3        4        -1              1
5        5        0               0
7        6        +1              1
6        7        -1              1
N=7      N=7                      Sum=12
Rank correlation coefficient
$$R_s = 1 - \frac{6 \times 12}{7(7^2 - 1)} = 1 - \frac{72}{336} = 1 - 0.214 = 0.786$$
There exists some consistency in the ranking of the brands by
the customers.
Ex. 13 Calculate the coefficient of correlation from the following
data by Spearman's rank difference method.
Price of Tea (Rs.)   Price of Coffee (Rs.)
75                   120
88                   134
95                   150
70                   115
60                   110
80                   140
81                   142
50                   100
Answer to Example 13
Price of Tea   R1   Price of Coffee   R2   (R1-R2)^2
75             4    120               4    0
88             7    134               5    4
95             8    150               8    0
70             3    115               3    0
60             2    110               2    0
80             5    140               6    1
81             6    142               7    1
50             1    100               1    0
N=8                                        Sum=6
$$R_s = 1 - \frac{6 \times 6}{8(8^2 - 1)} = 1 - \frac{36}{504} = 1 - 0.071 = 0.929$$
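When the data are raw values rather than ranks, the ranking step can be automated before the rank difference formula is applied. The sketch below ranks the tea and coffee prices from lowest to highest and then computes the coefficient; the helper names are illustrative, and the formula used is the simple one for data without tied ranks.

```python
def average_ranks(values):
    """Rank values from lowest to highest; ties get the mean of their ranks."""
    first, last = {}, {}
    for position, v in enumerate(sorted(values), start=1):
        first.setdefault(v, position)
        last[v] = position
    return [(first[v] + last[v]) / 2 for v in values]


def spearman_rank_correlation(x, y):
    """Rank both series, then apply Rs = 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    ranks_x, ranks_y = average_ranks(x), average_ranks(y)
    n = len(x)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(ranks_x, ranks_y))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))


# Prices in Rs., copied from Example 13.
tea = [75, 88, 95, 70, 60, 80, 81, 50]
coffee = [120, 134, 150, 115, 110, 140, 142, 100]

tea_ranks = average_ranks(tea)
coffee_ranks = average_ranks(coffee)
rho = spearman_rank_correlation(tea, coffee)

print("Tea ranks:   ", tea_ranks)
print("Coffee ranks:", coffee_ranks)
print(f"Rank correlation = {rho:.3f}")
```

The ranks produced match the R1 and R2 columns of the answer table, and the coefficient close to +1 indicates that tea and coffee prices move together.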
SUMMARY
In this chapter the definition of nonparametric tests has been
described. The advantages and limitations of nonparametric tests
have been explained. The various nonparametric methods, viz.,
the sign test, a rank sum test, the one sample run test, the
Kruskal-Wallis or H test and the rank correlation test, along with
their advantages, limitations and areas of application, have been
highlighted. Typical real world problems and applications of the
nonparametric tests have been discussed.
REVIEW QUESTIONS
1. Explain the term nonparametric test. What are the
advantages and limitations of nonparametric tests? When
are these techniques used?
2. Distinguish between parametric tests and nonparametric
tests.
3. Write short notes on:
a) The sign test
b) The paired sample sign test
c) H test
d) Kolmogorov-Smirnov test
4. Explain the Mann-Whitney U test. Where is it used? What
are the advantages of this test?
5. Explain Spearman's rank correlation. What are the
advantages of this technique? Where is it used? Discuss
its limitations.
CHAPTER 11 MULTIVARIATE ANALYSIS OF
DATA
Objectives
After reading this chapter, the learner will be able to:
• Understand the concepts of multivariate analysis
• Appreciate both the advantages and disadvantages of
multivariate analysis
• Know the classification of multivariate techniques and
their application
Structure
• Definition of Multivariate Analysis
• Objective of Multivariate Analysis
• Advantage of Multivariate Analysis
• Disadvantages of Multivariate Analysis
• Applications of Multivariate Analysis
• Classification of Multivariate Techniques
• Summary
• Review Questions
• Further Reading
DEFINITION OF MULTIVARIATE ANALYSIS
Multivariate analysis or multivariate techniques may be defined
as the collection of methods for analyzing data in which a
dependent variable is represented in terms of several
independent variables, and a number of observations are available
to define such a relationship. In brief, techniques that take
account of the various relationships among variables are termed
multivariate analysis or multivariate techniques. Mathematically,
multivariate analysis was defined by Takeuchi, Yanai and Mukherjee
as forming a linear composite vector in a vector space, which can
be represented in terms of the projection of a vector on to a
certain specified subspace.
Activity 11.1
List a few situations in business that would require the
application of multivariate analysis.
OBJECTIVE OF MULTIVARIATE ANALYSIS
The basic objectives of multivariate analysis are:
i) To represent a large collection of data in a
simplified way, by transforming a large number of
observations into a smaller number of composite scores.
ii) To predict the variability of the dependent variable based
on its covariance with all the independent variables.
iii) To classify individuals or objects into one of two or more
mutually exclusive and exhaustive groups on the basis of
a set of independent variables.
ADVANTAGE OF MULTIVARIATE ANALYSIS
The main advantage of multivariate analysis is that, since it
takes into account more than one independent variable affecting
the variability of the dependent variable, the conclusions drawn
are more accurate. The conclusions are also more realistic and
nearer to the real-life situation.
DISADVANTAGES OF MULTIVARIATE
ANALYSIS
i) It requires rather complex computation to arrive at a
satisfactory conclusion.
ii) Because of this, and because a large number of observations
on a large number of variables must be collected and
tabulated, it is a rather time-consuming process.
iii) Owing to (i) and (ii) above, multivariate analysis can at
times prove to be an expensive proposition.
iv) Specially trained staff are required to process and
analyse the complex data using multivariate
techniques.
APPLICATIONS OF MULTIVARIATE
ANALYSIS
These techniques are successfully employed in the following
areas:
a) Econometrics or Decision Making in Economics:
Such as the impact of inflation, money circulation, lowering of
tariffs, etc. on price rise.
b) Sociological Decision Making: Such as divorce rates
and their cause-effect relationships with marriage, social
demographics and income levels.
c) Agrarian Predictions: Such as impacts of rain,
fertilizers and mechanization on agricultural yields per
acre.
d) Drug Testing: Such as impact of new drugs on the
main disease and other side effects.
e) … structure, duties, penalty rates, etc., on government
revenues.
f) Industrial Decision Making: Such as plant location
which depends on infrastructure, availability of raw
materials, distribution channel etc.
CLASSIFICATION OF MULTIVARIATE
TECHNIQUES
The following is the list of important multivariate techniques
used in decision making process:
I) Multiple Regression
II) Canonical Correlation Analysis
III) Factor Analysis
IV) Multivariate Analysis of Variance
V) Cluster Analysis
VI) Multidimensional Scaling
VII) Multiple Discriminant Analysis
VIII) Latent Structure Analysis
The brief description of each technique is given below:
MULTIPLE REGRESSION
This technique is appropriate when we have a single metric
criterion variable, which is a function of a number of other
explanatory variables. The main objective of the multiple
regression technique is to predict the variability of the
dependent variable based on its covariance with all the
independent variables. Given the levels of the independent
variables, one can easily find out the level of the dependent
phenomenon through the multiple regression analysis model.
The main advantage of multiple regression is that it allows us to
utilize more of the information available to estimate the
dependent variable, resulting in greater accuracy in determining
the relationship.
Multiple regression analysis involves a three-step process,
beginning with the description of the multiple regression
equation. For example, equation 11.1 represents the case of two
independent variables:
$$Y = a + b_1 X_1 + b_2 X_2 \qquad (11.1)$$
where
Y = estimated value corresponding to the dependent variable,
a = Y-intercept,
$X_1$ and $X_2$ = values of the two independent variables,
$b_1$ and $b_2$ = slopes associated with $X_1$ and $X_2$ respectively.
We use the following three equations to determine the values of
a, $b_1$ and $b_2$:
$$\sum Y = na + b_1 \sum X_1 + b_2 \sum X_2 \qquad (11.2)$$
$$\sum X_1 Y = a \sum X_1 + b_1 \sum X_1^2 + b_2 \sum X_1 X_2 \qquad (11.3)$$
$$\sum X_2 Y = a \sum X_2 + b_1 \sum X_1 X_2 + b_2 \sum X_2^2 \qquad (11.4)$$
Solving equations 11.2, 11.3 and 11.4, we get the values of a,
$b_1$ and $b_2$.
Similarly, the method can be extended to any number of independent
variables.
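The extension to more independent variables is usually written in matrix form. The following is a sketch in standard least-squares notation; the symbols $X$, $y$, $\beta$ and $k$ are assumptions introduced here for illustration and do not appear in the text above. Let $X$ be the $n \times (k+1)$ matrix whose first column is all ones and whose remaining columns hold the $k$ independent variables, and let $y$ be the vector of the $n$ observed values of the dependent variable. The normal equations then read:

$$(X^{\top} X)\,\beta = X^{\top} y, \qquad \beta = (a, b_1, \dots, b_k)^{\top}$$

With $k = 2$, writing out the three rows of this system gives equations 11.2, 11.3 and 11.4.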
Example 1:
A company is interested in raising the level of productivity of its
workers through training and skill development. The results of the
training programmes organized for this purpose in 1998, in terms
of the number of programmes, the cost of the programmes and the
additional production month-wise, are given in the following table:
Month       No. of Training        Cost of Training        Additional
            Programmes Organized   Programmes (Rs. 000)    Production (units)
January      8                     10.2                    38.5
February     6                      8.4                    22.6
March        8                     11.2                    37.6
April       10                     11.1                    35.2
May         12                     13.9                    43.6
June        11                     12.0                    38.0
July         9                      9.3                    30.1
August       7                      9.7                    35.3
September   12                     12.3                    46.4
October      8                     11.4                    34.2
November     6                      9.3                    30.2
December    13                     14.3                    40.7
Fit the linear regression equation to show the effect of the number
of programmes organized and the cost of the programmes on the
rise in the level of production. Determine the contribution of each
individual variable to the total variation.
Solution: Let production be denoted by y, and the number of training
programmes and their cost by $x_1$ and $x_2$ respectively.
Computations for Multiple Regression Analysis
y       x_1   x_2     x_1 y    x_2 y     x_1 x_2   (x_1)^2   (x_2)^2
38.5     8    10.2    308.0    392.70     81.6       64      104.04
22.6     6     8.4    135.6    189.84     50.4       36       70.56
37.6     8    11.2    300.8    421.11     89.6       64      125.44
35.2    10    11.1    352.0    390.72    111.0      100      123.21
43.6    12    13.9    523.2    606.04    166.8      144      193.21
38.0    11    12.0    418.0    456.00    132.0      121      144.00
30.1     9     9.3    270.9    279.93     83.7       81       86.49
35.3     7     9.7    247.1    342.41     67.9       49       94.09
46.4    12    12.3    556.8    570.72    147.6      144      151.29
34.2     8    11.4    273.6    389.88     91.2       64      129.96
30.2     6     9.3    181.2    280.86     55.8       36       86.49
40.7    13    14.3    529.1    582.01    185.9      169      204.49
Total:
432.4  110   133.1   4096.3   4902.22   1263.5     1072     1513.27
To estimate the additional production (y) at a given level of
training programmes ($x_1$) and their cost ($x_2$), the regression
equation would be:
$$\hat{y} = a + b_1 x_1 + b_2 x_2$$
where the normal equations for the estimates are:
$$\sum y = na + b_1 \sum x_1 + b_2 \sum x_2 \qquad \text{(i)}$$
$$\sum x_1 y = a \sum x_1 + b_1 \sum x_1^2 + b_2 \sum x_1 x_2 \qquad \text{(ii)}$$
$$\sum x_2 y = a \sum x_2 + b_1 \sum x_1 x_2 + b_2 \sum x_2^2 \qquad \text{(iii)}$$
Substituting the values in the normal equations:
$$432.40 = 12a + 110.0\,b_1 + 133.10\,b_2 \qquad (1)$$
$$4096.30 = 110a + 1072.0\,b_1 + 1263.50\,b_2 \qquad (2)$$
$$4902.22 = 133.1a + 1263.5\,b_1 + 1513.27\,b_2 \qquad (3)$$
Multiplying (1) by 110 and (2) by 12, we get
$$47564.00 = 1320a + 12100.00\,b_1 + 14641.00\,b_2 \qquad (4)$$
$$49155.60 = 1320a + 12864.00\,b_1 + 15162.00\,b_2 \qquad (5)$$
Deducting (4) from (5), we get
$$1591.60 = 764\,b_1 + 521\,b_2 \qquad (6)$$
Multiplying (2) by 133.1 and (3) by 110, we get
$$545217.53 = 14641a + 142683.20\,b_1 + 168171.85\,b_2 \qquad (7)$$
$$539244.20 = 14641a + 138985.00\,b_1 + 166459.70\,b_2 \qquad (8)$$
Deducting (8) from (7), we get
$$5973.33 = 3698.20\,b_1 + 1712.15\,b_2 \qquad (9)$$
Multiplying (6) by 3698.20 and (9) by 764, we get
$$5886055.1 = 2825424.8\,b_1 + 1926762.2\,b_2 \qquad (10)$$
$$4563624.1 = 2825424.8\,b_1 + 1308082.6\,b_2 \qquad (11)$$
Deducting (11) from (10), we get
$$1322431.0 = 618679.6\,b_2, \quad \text{so } b_2 = 2.1375075$$
Substituting the value of $b_2$ in equation (9), we get
$$5973.33 = 3698.20\,b_1 + 1712.15\,(2.1375075)$$
$$3698.20\,b_1 = 2313.5965, \quad \text{so } b_1 = 0.6256007$$
Substituting the values of $b_1$ and $b_2$ in equation (1), we get
$$432.40 = 12a + (110 \times 0.6256007) + (133.1 \times 2.1375075)$$
$$12a = 432.40 - 68.816077 - 284.50225 = 79.081673, \quad \text{so } a = 6.59$$
The estimating regression equation will be
$$\hat{y} = 6.59 + 0.6256\,x_1 + 2.1375\,x_2$$
Collective contribution of an individual variable to the regression
function.
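The hand elimination in Example 1 can be cross-checked numerically. The following is a minimal sketch, assuming NumPy is available; it treats the three normal equations (1) to (3) as a linear system in a, b1 and b2, using the column totals from the computation table. The variable names are illustrative only.

```python
import numpy as np

# Coefficient matrix of normal equations (1)-(3).
# Columns correspond to the unknowns a, b1, b2.
normal_matrix = np.array([
    [12.0,  110.0,  133.10],
    [110.0, 1072.0, 1263.50],
    [133.1, 1263.5, 1513.27],
])

# Left-hand sides of (1)-(3): sum(y), sum(x1*y), sum(x2*y).
totals = np.array([432.40, 4096.30, 4902.22])

# Solve the 3x3 system directly instead of eliminating by hand.
a, b1, b2 = np.linalg.solve(normal_matrix, totals)

print(f"a  = {a:.4f}")
print(f"b1 = {b1:.4f}")
print(f"b2 = {b2:.4f}")
```

The solved values should agree with the elimination above to about four decimal places; small differences would come only from rounding in the hand computation.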
Example 2
The owner of a bakery is interested in predicting his revenue from
the sale of bread. It is believed that the sale of bread depends on
the expenditure on advertisement and the price charged. The
following data are obtained from his records for the previous 10 weeks:
Week   Sales Revenue    Expenditure on Advertisement   Price per doz.
       (Rs. Lakh) y     (Rs. 000) x_1                  (Rs.) x_2
 1     12               14                             46
 2      8               10                             44
 3     20               18                             38
 4     18               20                             48
 5     14               16                             44
 6     10               14                             50
 7      4                8                             44
 8     16               16                             38
 9      8                4                             48
10     20               18                             42
Computer Output
On processing the above data on a computer, the output of the least
squares estimates was as follows:
Variable   Regression Coefficient   Standard Error of Coefficient
C           13.153920               11.962969
x_1          0.88458671              0.19624843
x_2         -0.27966553              0.24369642
$R^2$ = 0.79974
Adjusted $R^2$ = 0.74252
Source       Sum of Squares   Degrees of Freedom   Mean Square   F Statistic
Regression   219.12846        2                    109.56423     13.97718
Residual      54.87154        7                      7.83879
You are required to:
(i) Give the least squares regression model for estimating the
revenue from the sale of bread, considering the expenditure on
advertisement and the price charged.
(ii) Interpret the regression coefficients and test the
hypotheses.
(iii) Interpret the F statistic and $R^2$.
(iv) Compare the actual sales revenue for the past ten weeks
with the estimates of sales revenue.
Solution:
(i) The least squares regression model will be:
$$\hat{y} = 13.153920 + 0.88458671\,x_1 - 0.27966553\,x_2$$
(ii) Significance test for the regression coefficients (each t value
is the coefficient divided by its standard error, with 7 degrees
of freedom):
(a) Constant: t(7) = 13.153920 / 11.962969 = 1.10
(b) $x_1$: t(7) = 0.88458671 / 0.19624843 = 4.51
(c) $x_2$: t(7) = -0.27966553 / 0.24369642 = -1.15
At the 5% level of significance, the Student's t(7) value is 2.365.
Hence, the null hypothesis cannot be rejected for C and $x_2$;
however, the null hypothesis is rejected for $x_1$. Thus, a change
in expenditure on advertisement has a significant bearing on the
sale of bread, though the same is not the case with price.
Confidence limits at the 95% level for the coefficient of $x_1$ are
0.8846 ± (0.196248)(2.365), i.e. 1.3487 and 0.4205.
(iii) F statistic: The F statistic for the regression model is
13.97718, whereas the F value at the 1% level of
significance for 2 and 7 degrees of freedom is
9.55. Hence, the null hypothesis is rejected and the
effect of the independent variables on the dependent
variable may be taken as significant.
$R^2$: The coefficient of multiple determination $R^2$ of
0.79974 indicates that the variations in expenditure on
advertisement and price explain 79.97% of the variation in
the sales revenue of bread; after adjusting for degrees of
freedom (adjusted $R^2$ = 0.74252) the figure is 74.25%.
This may be considered largely satisfactory.
(iv) Comparison of Actual Sales Revenue and Estimated
Sales Revenue:
Week                     1      2      3      4      5      6      7     8      9     10
Actual Revenue (y)       12     8      20     18     14     10     4     16     8     20
Estimated Revenue (ŷ)    12.67  9.69   18.45  17.42  15.00  11.55  7.93  16.68  3.27  17.33
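The kind of computer output used in Example 2 can be reproduced with a few lines of code. The following is a minimal sketch, assuming NumPy is available; it fits the bakery data by ordinary least squares and derives the standard errors, t ratios, R-squared values and F statistic from the residuals. All variable names are illustrative and not taken from any particular statistical package.

```python
import numpy as np

# Bakery data from Example 2 (10 weeks).
y = np.array([12, 8, 20, 18, 14, 10, 4, 16, 8, 20], dtype=float)    # sales revenue
x1 = np.array([14, 10, 18, 20, 16, 14, 8, 16, 4, 18], dtype=float)  # advertisement
x2 = np.array([46, 44, 38, 48, 44, 50, 44, 38, 48, 42], dtype=float)  # price

# Design matrix with a leading column of ones for the constant term.
X = np.column_stack([np.ones_like(y), x1, x2])
n, k = X.shape  # n observations, k estimated coefficients

# Least squares estimates of (constant, b1, b2).
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

fitted = X @ coef
residuals = y - fitted

sse = float(residuals @ residuals)            # residual sum of squares
sst = float(((y - y.mean()) ** 2).sum())      # total sum of squares
ssr = sst - sse                               # regression sum of squares

r2 = ssr / sst
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k)

mse = sse / (n - k)                           # residual mean square
std_err = np.sqrt(np.diag(mse * np.linalg.inv(X.T @ X)))
t_ratios = coef / std_err
f_stat = (ssr / (k - 1)) / mse

for name, b, se, t in zip(["C", "x1", "x2"], coef, std_err, t_ratios):
    print(f"{name:>2}  coef={b:11.6f}  se={se:9.6f}  t={t:7.3f}")
print(f"R2={r2:.5f}  adjusted R2={adj_r2:.5f}  F={f_stat:.3f}")
print("fitted:", np.round(fitted, 2))
```

Comparing each t ratio with the tabulated Student's t value (2.365 for 7 degrees of freedom at the 5% level) then gives the significance decisions discussed in part (ii).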
Example 3: To identify the variables influencing the
consumption of water and to determine the quantitative
relationship existing among the consumption of water and
temperature, production and size of work force, observations
were recorded at 17 agricultural farms. These were as follows:
Farm   Water Consumption   Average Temperature   Value of Production   Number of Persons
       (Gallons) (y)       During the Month      (Rs. Lakh) (x_2)      on Payroll (x_3)
                           (x_1)
 1     3067                58.8                   7107                 129
 2     2828                65.2                   6373                 141
 3     2891                70.9                   6796                 153
 4     2994                77.4                   9208                 166
 5     3082                79.3                  14792                 193
 6     3502                71.9                  11964                 175
 7     3060                63.9                  13526                 186
 8     3211                54.5                  12656                 190
 9     3286                39.5                  14119                 187
10     3125                43.6                  14571                 206
11     3022                56.0                  13619                 198
12     2922                64.7                  14575                 192
13     3950                73.0                  14556                 191
14     4488                78.9                  18573                 200
15     3295                79.4                  15618                 200
16     3898                81.9                  14564                 189
17     3542                44.5                  16691                 195
Computer Output
The computer output from the analysis of the above data was as
follows:
Variable   Coefficient   Standard Error   t ratio
C          3858.98       110.0             3.508
x_1           8.08220      5.594           1.445
x_2           0.193041     0.5428E-01      3.557
x_3         -19.6354       8.729          -2.249
$R^2$ = 0.6329
Standard Error of Estimate = 300.2
F ratio = 7.472; Degrees of freedom 3 and 13.
You are required to:
(i) Give the multiple regression analysis model,
(ii) Interpret the regression coefficients,
(iii) Give observations on the regression statistics, and
(iv) Estimate the water consumption (y) if $x_1$ = 70,
$x_2$ = 17,000 (Rs. Lakh) and $x_3$ = 170.
Solution:
(i) Regression Model
$$\hat{y} = 3858.98 + 8.08220\,x_1 + 0.193041\,x_2 - 19.6354\,x_3$$
(ii) Regression Coefficients
The Student's t value at the 5% level of significance and 13 degrees
of freedom is 2.160. Comparing this with the t ratios in the output,
the regression coefficients of $x_2$ and $x_3$ are significant while
that of $x_1$ is not significant.
Using the standard errors of the regression coefficients, the limits
of variation in the regression coefficients at the 5% level of
significance are:
(a) Constant = 3858.98 ± (2.160)(110.0) = 4096.58 and 3621.38
(b) $b_1$ = 8.08220 ± (2.160)(5.594) = 20.1652 and -4.0008
(c) $b_2$ = 0.193041 ± (2.160)(0.05428) = 0.310286 and 0.075796
(d) $b_3$ = -19.6354 ± (2.160)(8.729) = -0.7808 and -38.4900
(iii) Estimate of Water Consumption
The expected value of y when $x_1$ = 70, $x_2$ = 17000 and
$x_3$ = 170 is
$$\hat{y} = 3858.98 + (8.08220 \times 70) + (0.193041 \times 17000) - (19.6354 \times 170)$$
$$\hat{y} = 3858.98 + 565.754 + 3281.697 - 3338.018 = 4368.413$$
(iv) Regression Statistics
(a) Coefficient of Determination ($R^2$): The coefficient of
determination $R^2$ = 0.6329 indicates that the three
independent variables explain 63.29% of the total
variation in the dependent variable.
(b) Standard Error of Estimate: At the 5% level of significance the
limits of variation in the estimated value of y would be
$\hat{y} \pm (t)(SE)$:
4368.413 ± (2.160)(300.2) = 5016.845 and 3719.981.
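The arithmetic in Example 3 is easy to recompute step by step. The following is a small sketch in plain Python under two stated assumptions: the coefficient of x3 is taken as negative (as the subtraction in the worked estimate implies), and the standard error of x2 is read as 0.5428E-01, that is 0.05428. The dictionary names are illustrative.

```python
# Coefficients and standard errors as read from the computer output.
# Assumptions: x3 coefficient is negative; x2 standard error is 0.05428.
coef = {"const": 3858.98, "x1": 8.08220, "x2": 0.193041, "x3": -19.6354}
std_err = {"const": 110.0, "x1": 5.594, "x2": 0.05428, "x3": 8.729}

t_crit = 2.160        # Student's t, 5% level, 13 degrees of freedom
se_estimate = 300.2   # standard error of estimate from the output

# Values of the independent variables for which y is to be estimated.
new_farm = {"x1": 70, "x2": 17000, "x3": 170}

# Point estimate: constant plus each coefficient times its variable.
y_hat = coef["const"] + sum(coef[name] * value for name, value in new_farm.items())

# A coefficient is significant when |coefficient / standard error| > t_crit.
significant = {
    name: abs(coef[name] / std_err[name]) > t_crit
    for name in ("x1", "x2", "x3")
}

# Limits of variation: estimate +/- t * standard error.
coef_limits = {
    name: (b - t_crit * std_err[name], b + t_crit * std_err[name])
    for name, b in coef.items()
}
y_limits = (y_hat - t_crit * se_estimate, y_hat + t_crit * se_estimate)

print(f"estimated consumption: {y_hat:.3f}")
print(f"limits for the estimate: {y_limits[0]:.3f} to {y_limits[1]:.3f}")
for name, (low, high) in coef_limits.items():
    print(f"{name:>5}: {low:.6f} to {high:.6f}")
print("significant at 5%:", significant)
```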
CANONICAL CORRELATION ANALYSIS
In this technique, a set of criterion variables is predicted
simultaneously from their joint covariance with a set
of explanatory variables. This technique can be successfully
employed for metric data. The procedure used is as follows:
i) Assign weights to the dependent and independent
variables in such a way that the linear composite of the
criterion variables has a maximum correlation with the
linear composite of the explanatory variables.
ii) The process of finding the weights requires factor analysis
with two matrices.
iii) Mathematically, in canonical correlation analysis, the
weights of the two sets, viz. $a_1, a_2, a_3, \dots, a_k$ and
$b_1, b_2, b_3, \dots, b_j$, are so determined that the variables
$$X = a_1 x_1 + a_2 x_2 + a_3 x_3 + \dots + a_k x_k$$
and
$$Y = b_1 y_1 + b_2 y_2 + b_3 y_3 + \dots + b_j y_j$$
have a maximum common variance.
A canonical correlation is the correlation of two
canonical (latent) variables, one representing a set of
independent variables, the other a set of dependent variables.
Each set may be considered a latent variable based on
measured indicator variables in its set. The canonical
correlation is optimized such that the linear correlation
between the two latent variables is maximized. Whereas
multiple regression is used for many-to-one relationships,
canonical correlation is used for many-to-many relationships.
There may be more than one such linear correlation
relating the two sets of variables, with each such
correlation representing a different dimension by which the
independent set of variables is related to the dependent set. The
purpose of canonical correlation is to explain the relation of the
two sets of variables, not to model the individual variables.
Analogous with ordinary correlation, canonical correlation
squared is the percent of variance in the dependent set
explained by the independent set of variables along a given
dimension (there may be more than one). In addition to asking
how strong the relationship is between two latent variables,
canonical correlation is useful in determining how many
dimensions are needed to account for that relationship.
Canonical correlation finds the linear combination of variables
that produces the largest correlation with the second set
of variables. This linear combination, or "root," is extracted
and the process is repeated for the residual data, with the
constraint that the second linear combination of variables must
not correlate with the first one. The process is repeated until a
successive linear combination is no longer significant.
Canonical correlation is a member of the multivariate general linear
hypothesis (MGLH) family and shares many of the assumptions
of multiple regression such as linearity of relationships,
homoscedasticity (same level of relationship for the full range of
the data), interval or near interval data, untruncated
variables, proper specification of the model, lack of
high multicollinearity, and multivariate normality for purposes of
hypothesis testing.
Often in applied research, scientists encounter variables of
large dimensions and are faced with the problem of
understanding dependency structures, reduction of
dimensionalities, construction of a subset of good predictors
from the explanatory variables, etc. Canonical correlation
Analysis (CCA) provides us with a tool to attack these problems.
However, its appeal, and hence its motivation, seems to differ
between theoretical statisticians and social scientists.
The Model
Suppose there are two sets of variables, $Z_{(x)}$ and $Z_{(y)}$, with p
and q variables respectively. New linear combinations of each set
of variables, $u_k = a_k z_{(x)}$ and $v_k = b_k z_{(y)}$, are formed
such that the simple correlation coefficient $r_k$ between the
transformed variables $u_k$ and $v_k$ is maximized. The new variables
$u_k$ and $v_k$ are called the canonical variates, and $r_k$ is called
the canonical correlation coefficient between the canonical variates
$u_k$ and $v_k$. In all, there will be s pairs of such linear
transformations, k = 1, ..., s, where s is the smaller of p
and q. As in the case of principal component analysis,
successive pairs of canonical variates are required to be
uncorrelated with the preceding variates.
Assumption
The assumptions that must be met for applying canonical
correlation analysis are as follows.
1. Measurement error of the variables is minimal.
2. Variances of the variables are not restricted.
3. Magnitudes of the coefficients in the correlation matrix must
not be attenuated by large differences in the shapes of the
distributions for the variables.
Type of data: The variables need not be continuous or directly
measured. Nominal scaled data, which is appropriate where a
classificatory structure exists, can also be used. The use of
differently coded dummy variables, principal components of
either or both sets of observed variables, can also be made.
The Method:
The canonical model selects linear functions that have maximum
covariances between domains, subjected to the restrictions of
orthogonality, that is, the functions in the new pair of linear
functions must be uncorrelated with all previously located
functions in each new domain. The mathematics involved in the
extraction of these linear functions are similar to that of
principal component analysis. The difference lies in the matrix
that is subjected to the mathematical treatment. In principal
component analysis, the variance-covariance matrix is used,
whereas in canonical analysis the correlation matrix is used
to obtain the linear functions.
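The extraction of canonical variates described above can be sketched numerically. The following is a minimal illustration, assuming NumPy is available; it standardizes both sets of variables so that the correlation matrix is the one analysed, then obtains the canonical correlations from a singular value decomposition. The function names and the randomly generated data are hypothetical, introduced only for illustration; this is one common way to compute the result, not necessarily the routine used by any particular statistical package.

```python
import numpy as np

def inverse_sqrt(matrix):
    """Inverse symmetric square root of a positive-definite matrix."""
    values, vectors = np.linalg.eigh(matrix)
    return vectors @ np.diag(1.0 / np.sqrt(values)) @ vectors.T

def canonical_correlations(x, y):
    """Return canonical correlations, weights and variates for two sets."""
    n = x.shape[0]
    # Standardize so that the correlation matrices are analysed.
    zx = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)
    zy = (y - y.mean(axis=0)) / y.std(axis=0, ddof=1)
    rxx = zx.T @ zx / (n - 1)   # correlations within the first set
    ryy = zy.T @ zy / (n - 1)   # correlations within the second set
    rxy = zx.T @ zy / (n - 1)   # correlations between the two sets
    kx, ky = inverse_sqrt(rxx), inverse_sqrt(ryy)
    # Singular values of the whitened cross-correlation matrix are the
    # canonical correlations; there are min(p, q) of them.
    u, s, vt = np.linalg.svd(kx @ rxy @ ky)
    a = kx @ u[:, :len(s)]      # weights for the first set
    b = ky @ vt.T[:, :len(s)]   # weights for the second set
    return s, a, b, zx @ a, zy @ b

# Hypothetical data: 200 cases, 4 predictors and 3 criterion variables.
rng = np.random.default_rng(0)
predictors = rng.normal(size=(200, 4))
criteria = 0.5 * predictors[:, :3] + rng.normal(size=(200, 3))

r, a, b, u_scores, v_scores = canonical_correlations(predictors, criteria)
print("canonical correlations:", np.round(r, 3))
print("squared (shared variance per dimension):", np.round(r ** 2, 3))
```

The number of canonical correlations returned equals the smaller of p and q, and each successive pair of variates is uncorrelated with the pairs extracted before it, as the model requires.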
Significance Test
Several tests are available when the researcher wishes to test
the statistical significance of a null hypothesis of no relationship
between the criterion and the predictor variable sets. The most
widely used is the Bartlett (1951) test, which tests for the
significance of canonical correlations.
Interpretation
As a thumb rule, canonical correlations of 0.30 or less are
treated as trivial. When a statistically significant canonical
correlation is identified, the researcher desires to determine the
extent to which the various variables contributed to the
identified multivariate relationship. To facilitate this, several
additional coefficients are calculated. They are structure
coefficients, communality and adequacy coefficients,
redundancy coefficients, and analysis and index coefficients.
Details of each of these and other supplementary techniques
can be found in Thompson (1984). Even though canonical
analysis is one of the 'major methods' of multivariate analysis,
some difficulty is encountered in attempting to interpret a pair
of canonical variables, both of which are linear combinations of
the original variables.
Canonical analysis and regression analysis: Canonical
analysis is closely related to regression analysis. This involves
the regression of a vector of response variables on a vector of
predictors. However, a clear distinction can be noticed between
them. In canonical analysis, the two sets of variables are
treated symmetrically; this is not the case with multiple
regression. Another important point is that, in multiple
regression the fact of deciding a variable as a response is of
great importance. But in canonical analysis, either set of
variables can be used to predict the other, or more precisely,
both sets of variables simultaneously predict each other.
Example: For a group of 30 students studying in a certain
class, four ability variables, (i) Reading Comprehension Test
(RCT), (ii) Creativity Test (CT), (iii) Mechanical Reasoning Test
(MRT) and (iv) Abstract Reasoning Test (ART), and three
motive measures, (a) Sociability Inventory (SI), (b)
Physical Science Interest Inventory (PSII) and (c) Office Work
Interest Inventory (OWII), were measured. The details of these
are given in the following table:
RCT CT MRT ART SI PSII OWII
39 9 12 9 10 20 18
15 7 10 10 4 15 13
28 8 12 9 9 8 6
47 13 14 12 4 28 24
40 10 15 12 11 26 1
21 10 14 11 6 8 9
33 9 12 9 11 16 11
46 18 20 15 9 36 2
42 10 17 13 6 33 16
38 14 18 11 9 30 3
42 12 17 12 6 27 12
32 10 18 8 1 20 23
39 16 17 11 4 23 11
43 8 10 11 8 28 19
41 13 10 8 7 33 4
34 7 9 5 8 9 6
41 11 12 11 2 17 20
38 11 14 11 7 18 14
32 5 14 13 4 13 3
41 17 17 11 8 21 20
32 10 12 7 8 25 3
43 5 11 11 2 28 14
24 9 9 7 5 27 24
43 12 15 12 11 26 6
43 16 19 12 2 31 14
25 10 15 7 8 17 14
36 14 16 12 3 24 16
45 10 16 11 9 34 17
27 8 10 13 11 24 12
39 9 17 11 9 28 10
Study the canonical relationships between four ability variables
and three motive measures.
Solution: In SPSS, canonical correlation is included in
multivariate analysis of variance (MANOVA). Most other
software packages take the correlation matrix among the
dependent and independent variables as input, whereas SPSS
generates these correlations and uses them in the analysis.
The motive variables SI, PSII and OWII are the dependent
variables (criterion variables) and the ability variables RCT, CT,
MRT and ART are the independent variables (covariates in SPSS).
Within the cells, regression is performed in the
analysis of variance to extract three roots, and the canonical
coefficients (weights) for the three dependent variables are
displayed in Tables 1 and 2. Table 1 shows the coefficients when
raw scores are used and Table 2 when the variables are
standardized (with mean = 0 and standard deviation = 1). The
structural correlations of the dependent variables with the
canonical functions are displayed in Table 3. In order to choose
the most efficient function, the variance explained by each
canonical variable is obtained in the analysis of variance
(Table 4), leading to a choice of canonical function number 1.
A similar procedure for the covariates yields Tables 5 to 8,
leading to the choice of function 1. Again, canonical function 1
is chosen based on the variance (50.406). The final result is
summarized in Table 9.
Table 1: Canonical Correlation Raw Canonical
Coefficients for Dependent Variables
Table 2: Canonical Correlation Standardized Canonical
Coefficients for Dependent Variables
Table 3: Canonical Correlation Correlation between Dependent and
Canonical Variables
Table 4: Canonical Correlation Analysis of Variance
(Variance Explained by Canonical Variables of Dependent
Variables)
(i) Criterion composite (ability variables):
V
a
= 0.014xSI + 0.130xPSII  0.007xOWII
(ii) Predictor composite (motives, covariates):
V
m
= 0.096xRCT + 0.081xCT + 0.011xMRT + 0.093xART
The canonical correlation (simple correlation between the two
composites) is 0.611 when these discriminant function scores
are correlated. The structural correlations indicate that PSII is
the dominant variable in the dependent canonical function
(correlation = 0.999) and others
Table 5: Canonical Correlation Raw Canonical
Coefficients for Covariates
Table 6: Canonical Correlation of Variances (Standardized Canonical
Coefficients for Covariates)
Table 7: Canonical Correlation Correlation between
Covariates and Canonical Variables
Table 8: Canonical Correlation Analysis of Variance
(Variance Explained by Canonical Variables of the
Covariates)
Table 9: Canonical Correlation Summary of Results
FACTOR ANALYSIS
This technique is used when there is a systematic
interdependence among a set of observed variables and one
desires to find out something more fundamental or inherent
which creates this commonality. Factor analysis describes a
large set of measured variables in terms of relatively few
categories, known as factors. It helps in grouping variables into
factors (based on the correlation between variables), with the
value of a factor derived by summing the values of the original
variables which have been grouped into it. This technique is most
often used by behavioral scientists.
The essential purpose of factor analysis is to describe, if
possible, the covariance relationships among many variables
in terms of a few underlying but unobservable random
quantities called factors. A frequent source of confusion in the
field of factor analysis is the term factor. It sometimes refers to
a hypothetical, unobservable variable as in the phrase common
factor. In this sense, factor analysis must be distinguished
from component analysis since a component is an observable
linear combination. Factor is also used in the sense of matrix
factor, in that one matrix is a factor of second matrix if
the first matrix multiplied by its transpose equals the second
matrix. In this sense, factor analysis refers to all methods of
data analysis using matrix factors, including component analysis
and common factor analysis. A common factor is an
unobservable hypothetical variable that contributes to the
variance of at least two of the observed variables. The
unqualified term factor often refers to a common factor.
A unique factor is an unobservable hypothetical variable
that contributes to the variance of only one of the observed
variables. The model for common factor analysis posits one
unique factor for each observed variable.
Some definitions related to factor analysis are given as follows:
Factor: A factor represents the joint impact of a set of
attributes. There may be one factor or more than one factor
in a real-life problem, depending on the complexity of the
circumstances and the number of variables operating.
Factor loading: A factor loading is the value which tells about
the closeness between a variable and the related factor.
It reflects the correlation between a factor and a
variable. For drawing inferences, only the absolute value of
the factor loading is considered.
Communality: Communality indicates how much of a variable's
variance is accounted for by the common factors. It is the sum
of squares of the loadings of the variable on the common
factors. If X, Y, Z, ... are the factors, then the communality
of a variable is given as follows:
$$h^2 = (\text{loading on factor } X)^2 + (\text{loading on factor } Y)^2 + (\text{loading on factor } Z)^2 + \dots$$
Eigen value: The eigen value is the sum of the squared values of
the factor loadings pertaining to a factor. It is a measure of the
comparative significance of each factor under consideration.
Factor Rotation: Rotation is done in factor analysis to obtain a
simple structure of the data or to reduce the complexity due to
a large number of variables. This constitutes the geometric aspect
of factor analysis: through factor rotation, the axes on the graph
(where the points representing the variables are plotted)
are rotated, keeping the location of these points in relation to
each other undisturbed.
Conditions of a simple structure obtained by rotating the
axes, as given by L.L. Thurstone, include:
• In each vector of factor loadings, only a relatively small
number of variables will have high loadings and the rest
will have small loadings.
• Each variable will have a high loading on only a small
number of factors.
• For a pair of factors, a very small number of variables will
have high loadings on both factors.
Among the various methods of factor rotation, those commonly used
for obtaining a simple structure are Varimax rotation
and Quartimax rotation.
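Rotation towards a simple structure can be illustrated with a short Varimax sketch. The following assumes NumPy is available and uses a widely circulated iterative, SVD-based formulation of Varimax; the loading matrix is hypothetical (a clean two-factor pattern deliberately turned through 30 degrees), and the function names are illustrative. It is a sketch of the idea, not a replacement for the rotation routines in statistical packages.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal Varimax rotation; returns rotated loadings and rotation."""
    p, k = loadings.shape
    rotation = np.eye(k)
    previous = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient-like target matrix of the Varimax criterion.
        target = rotated ** 3 - rotated @ np.diag((rotated ** 2).sum(axis=0)) / p
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt
        # Stop when the criterion no longer improves appreciably.
        if s.sum() < previous * (1 + tol):
            break
        previous = s.sum()
    return loadings @ rotation, rotation

def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings."""
    return float((loadings ** 2).var(axis=0).sum())

# Hypothetical simple structure: variables 1-3 load on factor 1,
# variables 4-6 on factor 2.
simple = np.array([
    [0.8, 0.0], [0.7, 0.1], [0.6, 0.0],
    [0.0, 0.8], [0.1, 0.7], [0.0, 0.6],
])

# Turn the axes by 30 degrees to blur the pattern before rotating back.
theta = np.deg2rad(30)
turn = np.array([[np.cos(theta), -np.sin(theta)],
                 [np.sin(theta),  np.cos(theta)]])
unrotated = simple @ turn

rotated, rotation = varimax(unrotated)
print("unrotated loadings:\n", np.round(unrotated, 3))
print("rotated loadings:\n", np.round(rotated, 3))
```

Because the rotation is orthogonal, the relative positions of the variable points are unchanged: each variable's communality (row sum of squared loadings) is the same before and after rotation, and only the axes move.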
METHODS OF FACTOR ANALYSIS
There are three methods usually employed for factor analysis,
viz.,
a) The centroid method
b) The principal components method
c) The maximum likelihood method
Centroid Method:
The centroid method of factor analysis was developed by L.L.
Thurstone. This method was commonly used till 1950, before the
use of electronic computers became widespread. In the centroid
method, the sum of the loadings is maximized ignoring their
mathematical signs. The weights in the linear combination are
taken as +1 or -1.
The step-by-step treatment of data in the centroid method is given
below:
1. First, standard scores are determined for the given
observations. From these scores, statistics such as the mean,
the standard deviation and the coefficients of correlation
between the variables are determined. The product moment method
can conveniently be used for the calculations. The coefficients
of correlation are arranged in the form of a correlation
matrix r.
2. The centroid method requires that the weights for all the
variables be positive. Thus, if the correlation matrix
obtained is a positive manifold (i.e., disregarding the diagonal
elements, each variable has a larger sum of positive
correlations than of negative correlations), the correlation
matrix is used as such for extracting the first centroid
factor. If the correlation matrix is not a positive manifold,
a reflected correlation matrix is obtained.
3. From the correlation matrix (or reflected correlation
matrix), the first centroid factor is obtained as
under:
• For each column of the correlation matrix, the sum of
the coefficients, including the diagonal unity, is
obtained.
• Adding all the column sums gives the sum of column
sums (T).
• The first centroid factor loadings ($F_1$) are obtained by
dividing the column sums by the square root of the sum
of column sums ($\sqrt{T}$).
4. Then the second centroid factor is obtained as under:
• A matrix of factor cross products ($Q_1$) is obtained
by multiplying the first centroid factor loading of
each variable by the first centroid factor loading of
every variable.
• The first matrix of residual coefficients ($R_1$) is then
obtained by subtracting the factor cross products ($Q_1$) from
their respective coefficients in the correlation matrix (r).
• In case some of the residuals in the matrix of residual
coefficients ($R_1$) are negative, a reflected matrix of the
residual coefficients is obtained.
• For each column of the reflected residual matrix, the
sum is obtained. Dividing these sums by the square root of the
sum of column sums gives the second centroid factor ($F_2$).
5. The third, fourth and subsequent centroid factors are
extracted by following the procedure explained in
step (4) above.
6. After factoring, the vectors of centroid factors are arranged in
a table, and the communality ($h^2$) for each variable and the
eigen value for each factor are obtained. The
communality ($h^2$) is the sum of squares of the factor loadings
for the variable. The eigen value of a factor is the sum of
squares of the loadings on that factor; the corresponding total
for the communality column is the sum of the communalities.
7. From the eigen values, the proportion of total variance explained
and the proportion of common variance explained by each
factor are determined. The proportion of total variance
explained is determined by dividing the eigen value of a
factor by the number of variables considered. The
proportion of common variance explained is obtained by
dividing the eigen value of the factor by the sum of the
communalities (the total common variance).
Example 1:
From the following correlation matrix (r) relating to six
variables, obtain the first two centroid factors, the communality
for each variable and the eigen value for each factor, and comment
on the results.
Variables   1      2      3      4      5      6
1           1.00   0.55   0.43   0.32   0.28   0.36
2                  1.00   0.50   0.25   0.31   0.32
3                         1.00   0.39   0.25   0.33
4                                1.00   0.43   0.49
5                                       1.00   0.44
6                                              1.00
Solution: As the given correlation matrix (r) is a positive
manifold, the first centroid factor (A) is calculated as under:
Variables     1      2      3      4      5      6
1             1.00   0.55   0.43   0.32   0.28   0.36
2             0.55   1.00   0.50   0.25   0.31   0.32
3             0.43   0.50   1.00   0.39   0.25   0.33
4             0.32   0.25   0.39   1.00   0.43   0.49
5             0.28   0.31   0.25   0.43   1.00   0.44
6             0.36   0.32   0.33   0.49   0.44   1.00
Column sums   2.94   2.93   2.90   2.88   2.71   2.94
Sum of column sums: T = 17.30, so $\sqrt{T}$ = 4.16
First centroid factor (A) = column sum / $\sqrt{T}$:
              2.94/4.16  2.93/4.16  2.90/4.16  2.88/4.16  2.71/4.16  2.94/4.16
Factor (A)    0.707      0.704      0.697      0.692      0.651      0.707
Matrix of Factor Cross Products ($Q_1$)
Factor
Loading   0.707    0.704    0.697    0.692    0.651    0.707
0.707     0.4998   0.4977   0.4928   0.4892   0.4603   0.4998
0.704     0.4977   0.4956   0.4907   0.4872   0.4583   0.4977
0.697     0.4928   0.4907   0.4858   0.4823   0.4537   0.4928
0.692     0.4892   0.4872   0.4823   0.4789   0.4505   0.4892
0.651     0.4603   0.4583   0.4537   0.4505   0.4238   0.4603
0.707     0.4998   0.4977   0.4928   0.4892   0.4603   0.4998
Matrix of Residual Coefficients ($R_1$)
Variables     1        2        3        4        5        6
1             0.5002   0.0523   0.0628   0.1692   0.1803   0.1398
2             0.0523   0.5044   0.0093   0.2372   0.1483   0.1777
3*            0.0628   0.0093   0.5142   0.0923   0.2037   0.1628
4*            0.1692   0.2372   0.0923   0.5211   0.0205   0.0008
5*            0.1803   0.1483   0.2037   0.0205   0.5762   0.0203
6*            0.1398   0.1777   0.1628   0.0008   0.0203   0.5002
Column sums   1.1046   1.1292   1.0451   1.0411   1.1493   1.0016
Sum of column sums: $T_2$ = 6.4709, so $\sqrt{T_2}$ = 2.5438
Second centroid
factor (B)    0.4342   0.4439   0.4108   0.4093   0.4518   0.3937
*These variables are reflected.
Determination of Commonalities (h²) and Eigen Values
            Factor loading
Variables   Centroid   Centroid   Commonality (h²)
            Factor A   Factor B
1           0.707      0.4342     (0.707)² + (0.4342)² = 0.6883
2           0.704      0.4439     (0.704)² + (0.4439)² = 0.6926
3           0.697      0.4108     (0.697)² + (0.4108)² = 0.6546
4           0.692      0.4093     (0.692)² + (0.4093)² = 0.6464
5           0.651      0.4518     (0.651)² + (0.4518)² = 0.6279
6           0.707      0.3937     (0.707)² + (0.3937)² = 0.6548
Eigen Values
(Common Variance)           2.8837 + 1.0809 = 3.9646
Proportion of Total
Variance Explained          0.4806 + 0.1802 = 0.6608 (66.1%)
Proportion of Common
Variance Explained          0.7274 + 0.2726 = 1.0000 (100%)
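The arithmetic of Example 1 can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original solution: it assumes NumPy is available, the names `centroid_loadings`, `factor_a` and `factor_b` are made up for the illustration, and the reflection step is simplified to taking the magnitudes of the residual coefficients, as in the table of residuals above. Because it divides by the unrounded √T, the third decimal place can differ slightly from the hand calculation.

```python
import numpy as np

# Correlation matrix of Example 1 (upper triangle mirrored into the lower one).
R = np.array([
    [1.00, 0.55, 0.43, 0.32, 0.28, 0.36],
    [0.55, 1.00, 0.50, 0.25, 0.31, 0.32],
    [0.43, 0.50, 1.00, 0.39, 0.25, 0.33],
    [0.32, 0.25, 0.39, 1.00, 0.43, 0.49],
    [0.28, 0.31, 0.25, 0.43, 1.00, 0.44],
    [0.36, 0.32, 0.33, 0.49, 0.44, 1.00],
])

def centroid_loadings(matrix):
    """Column sums divided by the square root of the grand total T."""
    col_sums = matrix.sum(axis=0)
    return col_sums / np.sqrt(col_sums.sum())

factor_a = centroid_loadings(R)

# Residuals after removing factor A. Assumption: magnitudes only are kept,
# mirroring the tabulated residual matrix; signs of reflected variables
# are not tracked in this sketch.
residual = np.abs(R - np.outer(factor_a, factor_a))
factor_b = centroid_loadings(residual)

loadings = np.column_stack([factor_a, factor_b])
commonality = (loadings ** 2).sum(axis=1)       # h² per variable
eigen_values = (loadings ** 2).sum(axis=0)      # one per factor
prop_total = eigen_values / R.shape[0]          # share of total variance
prop_common = eigen_values / eigen_values.sum() # share of common variance

print(np.round(loadings, 3))
print(np.round(commonality, 4))
print(np.round(eigen_values, 4), np.round(prop_total, 4), np.round(prop_common, 4))
```

The same helper is reused for both factors because each centroid factor is obtained in the same way: column sums of the current matrix divided by the square root of their total.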
PRINCIPAL COMPONENTS METHOD
The principal components method of factor analysis was developed
by H. Hotelling. It makes certain improvements over the centroid
method and is widely used. A principal component is a
linear combination of variables. The first principal component is
the linear combination of variables that contributes the maximum
to the total variance. The second principal component,
uncorrelated with the first, is the linear combination of
variables that contributes the maximum to the residual variance,
and so on. The sum of the variances of all principal components
taken together is equal to the sum of the variances of all the
variables considered.
In the principal components method, all the factors are arranged
as per the value of their coefficients (a_ij, i = 1, 2, 3, ..., n),
from the highest positive to the highest negative, and the
principal factors are identified. Thus, k observed variables are
described linearly in terms of p new uncorrelated (orthogonal)
components/factors (F1, F2, F3, ..., Fp), and the combinations of
variables with the largest percentage variance are selected, such
that the first principal component is the one with the maximum
variance, the second principal component is the one with the next
maximum variance, and so on.
The step-by-step treatment of data in the principal components
method is as under:
1. Statistics such as mean, standard deviation and coefficient of
correlation between pairs of variables are obtained from the
given set of observations.
2. The coefficients of correlation are arranged in the form of a
correlation matrix (r). In case correlation matrix is not
positive manifold, a reflected matrix of the coefficients (r) is
obtained.
3. From the correlation matrix/reflected correlation matrix, the
first principal component is obtained as under.
• For each column, the sum of the correlation coefficients,
including the diagonal element, is obtained. The vector of
column sums is referred to as Ua1:
$Ua_1 = (Ua_{11}, Ua_{12}, Ua_{13}, \ldots, Ua_{1k})$, where $Ua_{1j} = \sum_{i=1}^{k} a_{ij}$
• A normalization factor is obtained for normalization of the
column sums (vector Ua1). The normalization factor is the
square root of the sum of squares of all the column sums:
$NF(Ua_1) = \sqrt{Ua_{11}^2 + Ua_{12}^2 + Ua_{13}^2 + \ldots + Ua_{1k}^2}$
Dividing the column sums by this factor gives the normalized
vector Va1.
• The elements of the normalized column sums (Va1) are
multiplied by their respective coefficients in each row
of the correlation matrix/reflected correlation matrix,
one row at a time, starting with the first row and ending
with the last row of the matrix, and the sum of these
products is put at the end of the row. The resultant vector
is referred to as Ua2.
For instance, a11 will be multiplied by the first element of Va1,
a12 by the second element of Va1, and so on, and the sum of the
resultant products will be placed at the end of the first row.
Next, a21 will be multiplied by the first element of Va1, a22 by
the second element of Va1, a23 by the third element of Va1, and so
on, and the sum of the resultant products will be placed at the
end of the second row. This exercise will be carried out for all
the rows of the matrix, from the first row to the last row.
• A normalization factor is obtained for normalization of the
row sums (vector Ua2). The normalization factor is the square
root of the sum of squares of all the elements of the vector
Ua2:
$NF(Ua_2) = \sqrt{Ua_{21}^2 + Ua_{22}^2 + Ua_{23}^2 + \ldots + Ua_{2k}^2}$
• The elements of the vector Ua2 are divided by the
normalization factor. This normalized vector is referred
to as Va2.
• Vector Va1 is compared with Va2. If the two vectors
are nearly identical, convergence is said to have occurred.
In case there is no convergence, Va2 is used in place of Va1
and the process is repeated till convergence occurs.
• Once convergence occurs, the first principal component is
extracted. For this, the elements of the converged vector Va
are multiplied by the square root of the normalization
factor of vector Ua2. The products constitute the elements
of the First Principal Component (F1).
4. For extracting the second principal component, the vector of
the first principal component is further processed as under:
• A matrix of factor cross products is obtained by
multiplying each element of the vector of the First Principal
Component by every element of the vector, including itself,
one by one. These products, presented in the form of a matrix,
are referred to as the first matrix of factor cross products
(Q1).
• The elements of the matrix of factor cross products
(Q1) are subtracted from the respective elements of the
matrix of correlation coefficients (r) one by one. The
residuals, presented in the form of a matrix, are referred to
as the first matrix of residual coefficients (R1).
• In case any of the residual coefficients is negative, it is
reflected and a matrix of reflected residual coefficients is
obtained. This is referred to as the first matrix of reflected
residual coefficients (R1').
5. From the first matrix of reflected residual coefficients (R1'),
the second principal component is extracted as under:
• The sum of the residual coefficients for each column of the
matrix is obtained. The vector of column-wise sums of residual
coefficients is referred to as Ub1:
$Ub_1 = (Ub_{11}, Ub_{12}, \ldots, Ub_{1k})$, where $Ub_{1j}$ is the sum of the $k$ residual coefficients in column $j$
• A normalization factor for the vector of sums of residual
coefficients is obtained. The normalization factor (NF) is the
square root of the sum of squares of all the column sums.
• The column sums are normalized by dividing the elements
of the vector Ub1 by the normalization factor (NF) one by one.
The vector of normalized column sums so obtained
is referred to as Vb1.
• The elements of the vector Vb1 are multiplied by their
respective coefficients in each row of the first reflected
matrix of residual coefficients (R1'), one row at a time,
starting with the first row and ending with the last row. The
vector of sums of products placed at the end of the rows is
referred to as Ub2.
• For normalization of the elements of vector Ub2, a
normalization factor (NF) is obtained by extracting the square
root of the sum of squares of all the elements of vector
Ub2. The elements of the vector Ub2 are divided by the
normalization factor one by one. The resultant vector is
referred to as the new normalized vector Vb2.
• If the elements of the two vectors (Vb1 and Vb2) are
nearly identical, it is taken as convergence. In case there is
no convergence, the trial vectors are processed again and
again till convergence occurs.
• The second principal component (F2) is extracted by
multiplying the elements of the converged vector Vb by the
square root of the normalization factor of Ub2.
6. The method for extracting the second principal component is
followed for extracting the third, fourth and subsequent
principal components.
To decide how many principal components are to be retained in
a study, Kaiser has suggested that since each standardized
variable has a variance of 1, any principal component with an
eigen value of less than 1 is not worth consideration.
Accordingly, only principal components having an eigen value
equal to or greater than 1 (≥ 1) are retained, and those with
an eigen value of less than 1 are excluded.
7. Finally, a matrix of factor loadings, commonalities of the
variables considered, eigen values of the factors and proportion
of variance explained by the factors is obtained. The
commonality (h²) for a variable is the sum of squares of the
loadings of that variable on the principal components. The eigen
value of a principal component/factor is the sum of squares of
the loadings on it; the eigen value of the commonality is the
sum of the commonalities.
The proportion of total variance explained is determined by
dividing the eigen value of a factor by the number of variables
considered. The proportion of common variance explained is
obtained by dividing the eigen value of the factor by the eigen
value of the commonality.
8. Finally, the results are interpreted: the variables are
expressed in terms of the factors, and the coefficients are used
to generate factor scores. The factor scores are used for further
multivariate analysis. Sometimes, factor rotation is done after
factor analysis to obtain a simple structure.
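Steps 3 to 5 amount to repeatedly multiplying a trial vector by the matrix and normalizing it. The sketch below illustrates that loop under stated assumptions: NumPy is assumed to be available, the function name `extract_component` and its tolerance are hypothetical, the matrix is the one given in Example 2, and reflection is modelled simply as taking the magnitudes of the residual coefficients. Because the loop iterates to a much tighter tolerance than a single hand pass, its figures can differ somewhat from a one-pass hand calculation.

```python
import numpy as np

# Correlation matrix of Example 2.
R = np.array([
    [1.000, 0.704, 0.707, 0.428, 0.357, 0.862],
    [0.704, 1.000, 0.387, 0.425, 0.379, 0.642],
    [0.707, 0.387, 1.000, 0.579, 0.342, 0.862],
    [0.428, 0.425, 0.579, 1.000, 0.375, 0.427],
    [0.357, 0.379, 0.342, 0.375, 1.000, 0.643],
    [0.862, 0.642, 0.862, 0.427, 0.643, 1.000],
])

def extract_component(matrix, tol=1e-8, max_iter=500):
    """Multiply by the matrix and normalise until the vector settles."""
    u = matrix.sum(axis=0)               # column sums (Ua1 or Ub1)
    v = u / np.linalg.norm(u)            # normalised column sums (Va1 or Vb1)
    for _ in range(max_iter):
        u_next = matrix @ v              # row-wise sums of products (Ua2 or Ub2)
        nf = np.linalg.norm(u_next)      # normalisation factor of that vector
        v_next = u_next / nf             # Va2 or Vb2
        converged = np.allclose(v, v_next, atol=tol)
        v = v_next
        if converged:
            break
    # Loadings = converged vector x square root of the normalisation factor.
    return v * np.sqrt(nf), nf

f1, eigen1 = extract_component(R)
residual = R - np.outer(f1, f1)          # R1 = r - Q1
reflected = np.abs(residual)             # assumption: reflection as magnitudes
f2, eigen2 = extract_component(reflected)

print(np.round(f1, 3), round(eigen1, 3))
print(np.round(f2, 3), round(eigen2, 3))
```

The normalization factor at convergence serves as the eigen value of the component, which is why the loadings are scaled by its square root: their squares then add up to that eigen value.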
Example 2
From the following correlation matrix (r) of six variables
compute first two principal component factors, their eigen
values and commonality for the variables. Also determine the
proportion of total variance and common variance explained by
each of the two component factors:
Variables 1 2 3 4 5 6
1 1.000 0.704 0.707 0.428 0.357 0.862
2 0.704 1.000 0.387 0.425 0.379 0.642
3 0.707 0.387 1.000 0.579 0.342 0.862
4 0.428 0.425 0.579 1.000 0.375 0.427
5 0.357 0.379 0.342 0.375 1.000 0.643
6 0.862 0.642 0.862 0.427 0.643 1.000
Solution
Since the given correlation matrix is a positive manifold, first
principal component factor is worked out as under:
Variables    1      2      3      4      5      6      Ua2    Normalized Vector
                                                              Va2 = Ua2/NF
1            1.000  0.704  0.707  0.428  0.357  0.862  1.705  0.453
2            0.704  1.000  0.387  0.425  0.379  0.642  1.452  0.385
3            0.707  0.387  1.000  0.579  0.342  0.862  1.625  0.431
4            0.428  0.425  0.579  1.000  0.375  0.427  1.286  0.341
5            0.357  0.379  0.342  0.375  1.000  0.643  1.232  0.326
6            0.862  0.642  0.862  0.427  0.643  1.000  1.848  0.490
Column Sums
(Ua1)        4.058  3.537  3.877  3.234  3.096  4.436
Normalized
vector (Va1)
= Ua1/NF**   0.443  0.387  0.424  0.353  0.338  0.485
**Normalization Factor for Ua1:
$\sqrt{(4.058)^2 + (3.537)^2 + (3.877)^2 + (3.234)^2 + (3.096)^2 + (4.436)^2} = \sqrt{83.73} = 9.15$
Normalization Factor for Ua2:
$\sqrt{(1.705)^2 + (1.452)^2 + (1.625)^2 + (1.286)^2 + (1.232)^2 + (1.848)^2} = \sqrt{14.243} = 3.774$
Comparing Va1 and Va2, we find that the two vectors are nearly
equal, so convergence has occurred. Hence, the first principal
component is extracted.
Extraction of First Principal Component
Variables   Characteristic vector (Va) × square root of the
            normalization factor of Ua2 (√3.774 = 1.943)
            = First Principal Component (F1)
1           0.443 × 1.943 = 0.861
2           0.387 × 1.943 = 0.752
3           0.424 × 1.943 = 0.824
4           0.353 × 1.943 = 0.686
5           0.338 × 1.943 = 0.657
6           0.485 × 1.943 = 0.942
First Matrix of Cross Products (Q1)
First Principal
Component   0.861  0.752  0.824  0.686  0.657  0.942
0.861       0.741  0.647  0.709  0.591  0.566  0.811
0.752       0.647  0.566  0.620  0.516  0.494  0.708
0.824       0.709  0.620  0.679  0.565  0.541  0.776
0.686       0.591  0.516  0.565  0.471  0.451  0.646
0.657       0.566  0.494  0.541  0.451  0.432  0.619
0.942       0.811  0.708  0.776  0.646  0.619  0.887
First Matrix of Residual Coefficients (R1)
Variables   1      2      3      4      5      6
1           0.259  0.057  0.002  0.163  0.209  0.051
2           0.057  0.434  0.233  0.091  0.115  0.066
3           0.002  0.233  0.321  0.014  0.199  0.086
4           0.163  0.091  0.014  0.529  0.076  0.219
5           0.209  0.115  0.199  0.076  0.568  0.024
6           0.051  0.066  0.086  0.219  0.024  0.113
Reflected Matrix of Residual Coefficients (R1) and
Extraction of Second Principal Component
Variables    1      2*     3*     4*     5*     6*     Ub2    Normalized Vector
                                                              Vb2 = Ub2/NF@
1            0.259  0.057  0.002  0.163  0.209  0.051  0.309  0.322
2*           0.057  0.434  0.233  0.091  0.115  0.066  0.416  0.433
3*           0.002  0.233  0.321  0.014  0.199  0.086  0.355  0.369
4*           0.163  0.091  0.014  0.529  0.076  0.219  0.445  0.463
5*           0.209  0.115  0.199  0.076  0.568  0.024  0.531  0.553
6            0.051  0.066  0.086  0.219  0.024  0.113  0.224  0.233
Column Sums
(Ub1)        0.741  0.996  0.855  1.092  1.191  0.559
Normalized
Vector Vb1
= Ub1/NF**   0.325  0.437  0.375  0.479  0.523  0.245
**Normalization Factor for Ub1:
$\sqrt{(0.741)^2 + (0.996)^2 + (0.855)^2 + (1.092)^2 + (1.191)^2 + (0.559)^2} = \sqrt{5.195} = 2.279$
@Normalization Factor for Ub2:
$\sqrt{(0.309)^2 + (0.416)^2 + (0.355)^2 + (0.445)^2 + (0.531)^2 + (0.224)^2} = \sqrt{0.924} = 0.961$
*These variables were reflected.
As vectors Vb1 and Vb2 are nearly equal, convergence has
occurred. Hence, Vb1 is taken as the characteristic vector Vb.
Extraction of Second Principal Component
Variables   Characteristic vector (Vb) × square root of the
            normalization factor of Ub2 (√0.961 = 0.98)
            = Second Principal Component (F2)
1           0.325 × 0.98 = 0.319
2           0.437 × 0.98 = 0.428
3           0.375 × 0.98 = 0.368
4           0.479 × 0.98 = 0.469
5           0.523 × 0.98 = 0.513
6           0.245 × 0.98 = 0.240
Matrix of Factor Loading, Commonality and Eigen Values
             Factor Loading
Variables    First       Second      Commonality (h²)
             Principal   Principal
             Component   Component
1            0.861       0.319       (0.861)² + (0.319)² = 0.843
2            0.752       0.428       (0.752)² + (0.428)² = 0.749
3            0.824       0.368       (0.824)² + (0.368)² = 0.814
4            0.686       0.469       (0.686)² + (0.469)² = 0.691
5            0.657       0.513       (0.657)² + (0.513)² = 0.695
6            0.942       0.240       (0.942)² + (0.240)² = 0.945
Eigen Value                          3.775 + 0.961 = 4.736
Proportion of
Total Variance                       0.629 + 0.160 = 0.789
Proportion of
Common Variance                      0.797 + 0.203 = 1.000
Example 3
A market research study was undertaken to determine the choice
of car buyers. Responses of 300 potential buyers were obtained on
a 10-point scale to the following six statements:
• The car should be built to last a long time.
• The car should be fuel efficient, giving at least 15 kilometers
per liter.
• The car should be easy to maintain and service by the
owner.
• The car should have space to seat 4 adult persons
comfortably.
• The car should have leg and head space for all the riders.
• The car should have good brakes, a critical part.
The results of the analysis are as follows:
             Factor loading
Statement    F1     F2     F3     Commonality (h²) = (F1)² + (F2)² + (F3)²
1            0.86   0.12   0.04   (0.86)² + (0.12)² + (0.04)² = 0.76
2            0.84   0.18   0.10   (0.84)² + (0.18)² + (0.10)² = 0.75
3            0.68   0.24   0.15   (0.68)² + (0.24)² + (0.15)² = 0.54
4            0.10   0.92   0.05   (0.10)² + (0.92)² + (0.05)² = 0.86
5            0.06   0.94   0.08   (0.06)² + (0.94)² + (0.08)² = 0.89
6            0.12   0.14   0.89   (0.12)² + (0.14)² + (0.89)² = 0.83
Eigen Values 1.94   1.85   0.84   4.63
Proportion
of Variance
Explained    0.32   0.31   0.14   0.77
Interpret the results and identify the automobile characteristics
important to consumers.
Solution:
Factor Loading: It is observed that F1 is a good fit on the data
for statements 1, 2 and 3 but a poor fit on the other three
statements. Thus, statements 1, 2 and 3 perhaps measure some
basic attitude or requirement of the users, one likely candidate
being economy in operation. This consideration is uppermost in
the mind of potential car buyers.
Factor F2 is a good fit on statements 4 and 5, but a poor fit on
the other four statements. It measures something not measured by
statements 1, 2, 3 and 6. One likely view is adequate and
comfortable space inside the vehicle.
Similarly, F3 is a good fit on statement 6 and measures something
different from statements 1, 2, 3, 4 and 5. It perhaps
emphasizes the aspect of safety. However, since only 2 statements
are associated with F2 and only 1 with F3, the basis for these
conclusions cannot be considered strong.
Commonality: Commonality (h²) indicates the proportion of the
variance in responses to a statement explained by the factors.
Here, the three factors explain 75% or more of the variance
associated with statements 1, 2, 4, 5 and 6, though only about
54% of that of statement 3. Thus the three factors fit the data
quite well.
Eigen Value: Eigen values indicate how well each of the
identified factors fits the data from all the respondents on all
the statements. The first two factors are significant, their
eigen values being greater than 1; the third, at 0.84, falls
below this threshold.
Proportion of Variance Explained: Eigen values divided by the
number of statements indicate the proportion of variance in
standardized response scores explained by each factor. Here the
three factors individually explain 32.26%, 30.91% and
13.91% of the variance, and in total 77.07% of the variance in
the data. As the factors together explain about 77% of the
variance, they are considered a good fit for the data.
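The figures quoted in this interpretation follow directly from the loading table of Example 3. The sketch below recomputes them; it assumes NumPy is available, and the variable names (`loadings`, `retained`, `dominant_factor`) are made up for the illustration. The `retained` line applies Kaiser's rule of keeping factors with an eigen value of at least 1.

```python
import numpy as np

# Factor loadings of the six statements on F1, F2 and F3 (Example 3).
loadings = np.array([
    [0.86, 0.12, 0.04],
    [0.84, 0.18, 0.10],
    [0.68, 0.24, 0.15],
    [0.10, 0.92, 0.05],
    [0.06, 0.94, 0.08],
    [0.12, 0.14, 0.89],
])

commonality = (loadings ** 2).sum(axis=1)      # h² for each statement
eigen_values = (loadings ** 2).sum(axis=0)     # one per factor
proportion = eigen_values / loadings.shape[0]  # share of total variance
retained = eigen_values >= 1.0                 # Kaiser's criterion

# Factor on which each statement loads most heavily (1-based).
dominant_factor = np.abs(loadings).argmax(axis=1) + 1

print(np.round(commonality, 2))
print(np.round(eigen_values, 2), np.round(proportion, 4))
print(retained, dominant_factor)
```

Reading `dominant_factor` alongside the statements is what supports the grouping of statements 1 to 3 under F1, statements 4 and 5 under F2, and statement 6 under F3.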
MAXIMUM LIKELIHOOD METHOD
The maximum likelihood method of factor analysis estimates the
most likely values of the population correlation coefficients
(r_p) from the sample correlation coefficients (r_s). In this
process, the first factor is extracted initially. From the vector
of the first factor a matrix of residual coefficients is obtained.
A significance test is applied to decide whether a second factor
should be extracted or not. A similar significance test is applied
before extracting the third, fourth and subsequent factors.
Factoring is stopped when the significance test fails to reject
the null hypothesis.
In the maximum likelihood method, interpretation of factor
loadings and other values is done in the same way as in the
centroid method and the principal components method.
The mathematical exercise of processing data in the maximum
likelihood method requires the use of calculus, higher algebra,
matrix algebra, etc. This limits the use of the method in many
situations.
MULTIVARIATE ANALYSIS OF VARIANCE
This technique is used when several metric dependent variables
are involved in a study along with many nonmetric explanatory
variables. For example, a market researcher who desires to know
the effect of a test-marketing advertising campaign on sales,
awareness, knowledge and attitudes of customers should adopt
multivariate analysis of variance (MANOVA) for achieving
the desired objectives.
Multivariate analysis of variance is used first to investigate
whether the population mean vectors are the same and, if
not, which mean components differ significantly. MANOVA is
carried out under the following two assumptions:
1. Dispersion matrices of the various populations are the same.
2. Each population is multivariate normal.
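As a rough illustration of how equality of mean vectors can be examined, the sketch below computes Wilks' lambda, one commonly used MANOVA statistic that is not named in the text above. Everything in it is an assumption made for illustration: NumPy is assumed to be available, the function name `wilks_lambda` is hypothetical, and the two small groups of (sales, awareness) scores are made-up numbers, not data from any study. Values of lambda near 1 suggest similar mean vectors; values near 0 suggest that the groups differ.

```python
import numpy as np

def wilks_lambda(groups):
    """det(W) / det(W + B) for a list of (n_i x p) arrays, one per group."""
    all_obs = np.vstack(groups)
    grand_mean = all_obs.mean(axis=0)
    p = all_obs.shape[1]
    within = np.zeros((p, p))    # W: pooled within-group sums of squares and products
    between = np.zeros((p, p))   # B: between-group sums of squares and products
    for g in groups:
        centred = g - g.mean(axis=0)
        within += centred.T @ centred
        diff = (g.mean(axis=0) - grand_mean).reshape(-1, 1)
        between += g.shape[0] * (diff @ diff.T)
    return np.linalg.det(within) / np.linalg.det(within + between)

# Made-up scores on two dependent variables (sales, awareness).
no_campaign = np.array([[10.0, 3.0], [12.0, 4.0], [11.0, 5.0], [13.0, 4.0]])
campaign = np.array([[15.0, 6.0], [17.0, 7.0], [16.0, 8.0], [18.0, 7.0]])

lam = wilks_lambda([no_campaign, campaign])
print(round(lam, 4))
```

Pooling the within-group matrices in this way relies on the first assumption listed above, namely that the dispersion matrices of the populations are the same.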
Research Questions
The main objective in using MANOVA is to determine whether the
response variables are altered by the observer's manipulation of
the independent variables. Accordingly, several types of
research questions may be answered by using MANOVA:
1) What are the main effects of the independent variables?
2) What are the interactions among the independent variables?
3) What is the importance of the dependent variables?
4) What is the strength of association between dependent
variables?
5) What are the effects of covariates? How may they be
utilized?
Advantages of MANOVA
• Tests the effects of several independent variables and several
outcome (dependent) variables within a single analysis.
• Has the power of convergence (no single operationally defined
dependent variable is likely to capture perfectly the conceptual
variable of interest).
• Independent variables of interest are likely to affect a number
of different conceptual variables; for example, an organisation's
non-smoking policy will affect satisfaction, production,
absenteeism, health insurance claims, etc.
• Can provide a more powerful test of significance than is
available when using univariate tests.
• Reduced error rate compared with performing a series of
univariate tests.
• Interpretive advantages over a series of univariate ANOVAs.
Limitations
Outliers: Like ANOVA, MANOVA is extremely sensitive to
outliers. Outliers may produce either a Type I or a Type II error
and give no indication as to which type of error is occurring in
the analysis. There are several programs available to test for
univariate and multivariate outliers.
Multicollinearity and Singularity: When there is high
correlation between dependent variables, one dependent
variable becomes a near-linear combination of the other
dependent variables. Under such circumstances, it would
be statistically redundant and suspect to include both
variables.
Cluster Analysis
Cluster analysis methods are judgmental methods. In these
methods, variables are classified into clusters. In technical
terms, a cluster consists of variables that correlate highly with
one another and have comparatively low correlation with variables
in other clusters. The groups to be determined in cluster
analysis are not predefined. This technique is employed for
defining the market segments of a product on the basis of several
characteristics of the customers, such as demographics,
socio-economic considerations, psychological factors, purchasing
power and the like.
The basic aim of cluster analysis is to find natural or real
groupings, if any, of a set of individuals (or objects or points
or units or whatever). This set of individuals may form a complete
population or be a sample from a larger population. More
formally, cluster analysis aims to allocate a set of individuals
to a set of mutually exclusive, exhaustive groups such that
individuals within a group are similar to one another while
individuals in different groups are dissimilar. This set of groups
is called a partition or dissection. Cluster analysis can also be
used for summarizing the data rather than finding natural
or real groupings. Grouping or clustering is distinct from
the classification methods in the sense that
classification pertains to a known number of groups, and
the operational objective is to assign new observations to
one of these groups. Cluster analysis is a more primitive
technique in that no assumptions are made concerning the
number of groups or the group structure. Grouping is done on
the basis of similarities or distances (dissimilarities).
ILLUSTRATION ON CLUSTER ANALYSIS
A study conducted by the author to find the brand preferences
for Coca-Cola brands among in-store customers of shopping malls
and supermarkets employed cluster analysis.
The initial analysis revealed that customers' preferences can be
grouped into three meaningful clusters by employing multivariate
statistical data analysis with any statistical software such as
SPSS/Statistica/SAS. In this approach, we adopted a method in
which a cluster center is selected initially, and all objects
within a prespecified threshold distance are included in that
cluster. The table given below shows the existence of 3 clusters
in the sample chosen for the analysis:
Final Cluster Centers
           Cluster
           1      2      3
Coke       3.75   2.42   .00
Sprite     4.25   1.52   .00
Fanta      2.00   2.42   .00
Limca      5.00   4.67   .00
Maaza      1.25   4.08   .00
Sunfil     4.75   5.93   4.00
In order to group objects together, some kind of similarity or
dissimilarity measure is computed from their preferences
towards the various brands. Similar groups are clubbed together
based on their choice of preferred brands. The most
popular distance measure is the Euclidean distance,
$d_{ij}^{2} = \sum_{m} (X_{im} - X_{jm})^{2}$, where $X_{im}$ and $X_{jm}$ represent the
standardized (to mean zero and unit standard deviation) values
of the mth attribute for groups i and j, and $d_{ij}^{2}$ is the
squared Euclidean distance.
Distances between Final Clusters
Cluster    1       2       3
1          -       4.349   7.953
2          4.349   -       7.495
3          7.953   7.495   -
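The distances between final clusters can be reproduced from the final cluster centers with the Euclidean formula. The sketch below is illustrative only: it assumes NumPy is available, applies the formula to the raw (unstandardized) centers as tabulated, and its results can differ from the printed distances in the third decimal because the centers are rounded to two places.

```python
import numpy as np

# Final cluster centers: rows are clusters 1-3,
# columns are Coke, Sprite, Fanta, Limca, Maaza, Sunfil.
centers = np.array([
    [3.75, 4.25, 2.00, 5.00, 1.25, 4.75],
    [2.42, 1.52, 2.42, 4.67, 4.08, 5.93],
    [0.00, 0.00, 0.00, 0.00, 0.00, 4.00],
])

# Pairwise differences between every pair of centers, then the
# square root of the summed squared differences.
diff = centers[:, None, :] - centers[None, :, :]
distances = np.sqrt((diff ** 2).sum(axis=2))

print(np.round(distances, 3))
```

The resulting matrix is symmetric with zeros on the diagonal, which is why the table needs only three distinct distances for three clusters.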
The most important assumption made here by the author is
that the basic measure of similarity on which the clustering is
based is a valid measure of the similarity between the choices of
the target audience. A second major assumption is that there is
theoretical justification for structuring the purchasing
preferences towards the various brands of Coca-Cola into
clusters.
Clusters were identified by following Ward's method. Three groups
were clearly identified during the data analysis. Preliminary
analysis shows that the soft drink preferences of the sample can
be clubbed into 3 groups:
Group 1 prefers COKE, SPRITE and FANTA (carbonated) N = 45
Group 2 prefers LIMCA and MAAZA (fruit based) N = 43
Group 3 prefers SUNFIL (concentrates) N = 12
Cluster analysis: Ward method
Ward
Method                   Coke      Sprite    Fanta     Limca     Maaza     Sunfil
1      Mean              3.1333    1.6444    1.6444    5.0222    3.7333    5.8222
       N                 45        45        45        45        45        45
       Std. Deviation    .40452    1.00353   .52896    .86340    .86340    .64979
2      Mean              1.7907    1.6512    3.1860    4.3256    4.1860    5.9302
       N                 43        43        43        43        43        43
       Std. Deviation    .67465    .75226    .73211    1.32235   .73211    .33773
3      Mean              .0000     .0000     .0000     .0000     .0000     4.0000
       N                 12        12        12        12        12        12
       Std. Deviation    .00000    .00000    .00000    .00000    .00000    2.955420
Total  Mean              2.1800    1.4500    2.1100    4.1200    3.4800    5.6500
       N                 100       100       100       100       100       100
       Std. Deviation    1.14926   .98857    1.22181   1.80504   1.50742   1.25831
By plotting the mean values we can further analyse these three
groups in terms of age and income. These analyses will help the
company to fine-tune its promotions in future. As with other
multivariate statistical techniques, the author resorted to
statistical analysis while interpreting the clusters. It is also
observed that the clusters are significantly different from one
another, as shown in the ANOVA.
ANOVA
           Cluster               Error
           Mean Square   df      Mean Square   df    F         Sig.
Coke       35.797        2       .610          97    58.686    .000
Sprite     28.524        2       .409          97    69.689    .000
Fanta      30.687        2       .891          97    34.445    .000
Limca      115.947       2       .935          97    124.445   .000
Maaza      97.897        2       .301          97    325.576   .000
Sunfil     21.214        2       1.179         97    18.000    .000
Classification based on age
The author then analyzed the clusters on critical demographic
variables, using ANOVA, a popular parametric statistical test,
to find out how significantly the clusters differ on them.
ANOVA
                                   Sum of
                                   Squares    df    Mean Square   F        Sig.
Age            Between Groups      6.076      2     3.038         4.109    .019
               Within Groups       71.714     97    .739
               Total               77.790     99
Income         Between Groups      55.776     2     27.888        11.442   .000
               Within Groups       236.414    97    2.437
               Total               292.190    99
Quantity       Between Groups      10.994     2     5.497         1.466    .236
Purchased      Within Groups       363.756    97    3.750
               Total               374.750    99
Frequency      Between Groups      2.402      2     1.201         6.918    .002
of Visit       Within Groups       16.833     97    .174
               Total               19.240     99
Frequency      Between Groups      13.324     2     6.662         5.263    .007
of Purchase    Within Groups       122.786    97    1.266
               Total               136.110    99
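Each F value in the table is the between-groups mean square divided by the within-groups mean square, where a mean square is a sum of squares divided by its degrees of freedom. The short sketch below reworks that arithmetic from the tabulated sums of squares; the dictionary layout and the function name `f_ratio` are made up for the illustration, and small differences in the last decimal place arise from rounding in the printed table.

```python
# Between-groups and within-groups sums of squares from the ANOVA table.
sums_of_squares = {
    "Age": (6.076, 71.714),
    "Income": (55.776, 236.414),
    "Quantity purchased": (10.994, 363.756),
    "Frequency of visit": (2.402, 16.833),
    "Frequency of purchase": (13.324, 122.786),
}
DF_BETWEEN = 2   # 3 clusters - 1
DF_WITHIN = 97   # 100 respondents - 3 clusters

def f_ratio(ss_between, ss_within):
    """Return (mean square between, mean square within, F)."""
    ms_between = ss_between / DF_BETWEEN
    ms_within = ss_within / DF_WITHIN
    return ms_between, ms_within, ms_between / ms_within

results = {name: f_ratio(*ss) for name, ss in sums_of_squares.items()}

for name, (ms_b, ms_w, f_value) in results.items():
    print(f"{name:22s} MSB={ms_b:7.3f}  MSW={ms_w:6.3f}  F={f_value:7.3f}")
```

A large F means the cluster means differ by more than the spread within clusters would suggest, which is what the Sig. column then evaluates.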
Multiple comparisons using Tamhane's test were applied to explore
the possible relationships among the in-store customers chosen
for the study, with a view to tracking the effective management
of various sales promotions in relation to their patronage. A
significance value below 0.05 in the multiple comparison table
is considered significant.
Dependent     (I) Ward   (J) Ward   Mean          Std.      Sig.    95% Confidence Interval
Variable      Method     Method     Difference    Error             Lower      Upper
                                    (I-J)                           Bound      Bound
Age           1          2          -.2134        .18545    .583    -.6652     .2384
              1          3          .5889         .26077    .106    -.0983     1.2760
              2          1          .2134         .18545    .583    -.2384     .6652
              2          3          .8023*        .26916    .002    .1007      1.5040
              3          1          -.5889        .26077    .106    -1.2760    .0983
              3          2          -.8023*       .26916    .002    -1.5040    -.1007
Income        1          2          -.1984        .33194    .910    -1.0071    .6102
              1          3          2.1833*       .51764    .002    .8064      3.5602
              2          1          .1984         .33194    .910    -.6102     1.0071
              2          3          2.3818*       .52949    .001    .9859      3.7777
              3          1          -2.1833*      .51764    .002    -3.5602    -.8064
              3          2          -2.3818*      .52949    .001    -3.7777    -.9859
Quantity      1          2          -.6972        .41028    .254    -1.6963    .3020
Purchased     1          3          -.5111        .65105    .827    -2.2294    1.2072
              2          1          .6972         .41028    .254    -.3020     1.6963
              2          3          .1860         .64120    .989    -1.5166    1.8887
              3          1          .5111         .65105    .827    -1.2072    2.2294
              3          2          -.1860        .64120    .989    -1.8887    1.5166
Frequency     1          2          -.2594*       .09375    .021    -.4880     -.0309
of Visits     1          3          -.4222*       .07446    .000    -.6070     -.2374
              2          1          .2594*        .09375    .021    .0309      .4880
              2          3          -.1628*       .05696    .020    -.3044     -.0211
              3          1          .4222*        .07446    .000    .2374      .6070
              3          2          .1628*        .05696    .020    .0211      .3044
Frequency     1          2          -.2977        .25039    .557    -.9078     .3124
of Purchase   1          3          -1.1833*      .25281    .000    -1.8243    -.5424
              2          1          .2977         .25039    .557    -.3124     .9078
              2          3          -.8857*       .27070    .007    -1.5652    -.2061
              3          1          1.1833*       .25281    .000    .5424      1.8243
              3          2          .8857*        .27070    .007    .2061      1.5652
*The mean difference is significant at the 0.05 level.
The above analysis shows that the clusters differ significantly
in respect of income, frequency of visits and frequency of
purchase and, to a lesser extent, age group. Quantity purchased
is not significant and cannot explain the cluster preferences.
Source: Adapted from a research paper titled, A study on MicroProfiling of
Cola customers in the shopping mall, by Dr. S. Shajahan, published in ICFAI
journal of Marketing Management, Hyderabad in August 2004.
Multidimensional Scaling
This technique helps in measuring an item in more than one
dimension at a time. With the help of this technique, we can
scale objects, individuals or both with a minimum of
information. According to Robert Ferber, it enables the
investigator to study the perceptual structure of a set of
stimuli and the cognitive process underlying the development of
the structure. The process of multidimensional scaling calls for
rank ordering each pair of objects in terms of similarity. Then
the ordered similarities are transformed into distances through
analytic computations and are shown in n-dimensional space in
such a way that the interpoint distances depict
the original interpoint proximities. After this sort of mapping
is done, the dimensions are usually interpreted and labeled by
the investigator.
Basic Concepts
MDS uses proximities among different objects as inputs.
Proximity is a value that denotes how similar or different two
objects are, or are perceived to be, or any measure of this type.
MDS then uses these proximities data to produce a geometric
configuration of points (objects), preferably in a
two-dimensional space, as output. Attribute-based data such as
an objects × attributes profile matrix, and non-attribute-based
data, including similarity and preference data, can be used to
obtain proximities data. The Euclidean distances (derived)
between objects in the two-dimensional space are then
computed and compared with the proximities data. A key
concept of MDS is that the derived distances (output) between
the objects should correspond to the proximities (input). If we
make the rank order of derived distances between objects/
brands correspond to the rank order of the proximities data, the
process is known as nonmetric MDS. On the contrary, if the
derived distances are either multiple or linear functions of the
proximities, then it is known as metric MDS. Nonmetric MDS
assumes that proximities data are ordinal, but metric MDS
assumes that they are metric. However, in both cases the
output (derived distances) is metric.
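To make the idea of derived distances concrete, the sketch below implements classical (metric) MDS, one standard way of turning a matrix of proximities into coordinates; the text above does not prescribe a particular algorithm, so this choice is an assumption. NumPy is assumed to be available, the function name `classical_mds` is hypothetical, and the dissimilarities among four objects are made up for illustration.

```python
import numpy as np

def classical_mds(dist, dims=2):
    """Coordinates in `dims` dimensions whose distances approximate `dist`."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-centre the squared distances to get an inner-product matrix.
    b = -0.5 * centering @ (dist ** 2) @ centering
    values, vectors = np.linalg.eigh(b)             # eigenvalues in ascending order
    order = np.argsort(values)[::-1][:dims]         # keep the largest `dims`
    scale = np.sqrt(np.clip(values[order], 0.0, None))
    return vectors[:, order] * scale

# Made-up dissimilarities among four objects (input proximities).
proximities = np.array([
    [0.0, 3.0, 5.0, 4.0],
    [3.0, 0.0, 4.0, 5.0],
    [5.0, 4.0, 0.0, 3.0],
    [4.0, 5.0, 3.0, 0.0],
])

coords = classical_mds(proximities, dims=2)

# Derived distances between the plotted points (output).
gap = coords[:, None, :] - coords[None, :, :]
derived = np.sqrt((gap ** 2).sum(axis=2))

print(np.round(coords, 3))
print(np.round(derived, 3))
```

Comparing `derived` with `proximities` is the check described above: in metric MDS the derived distances should track the input values, while nonmetric MDS asks only that their rank orders agree.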
Generally, two methods are employed for MDS:
attribute-based and similarity/dissimilarity-based. The
attribute-based approach is similar to what we have described in
cluster analysis, except that the input data are then further
analyzed using either factor analysis or discriminant analysis.
The second approach, the similarity/dissimilarity-based approach,
is very easy to understand intuitively, and quite useful in
gaining a good understanding of the consumer psyche. In this
approach, we need some kind of distance measure between the
brands rated. The distance measure used as input could be a
simple ranking of distances between a brand and all other brands
by a customer.
One way of doing this is to provide a customer (respondent)
with cards, each with a pair of brands written on it, and
ask him to write down a number indicating the difference
between the two brands on any numerical scale that can
represent distance. This can be repeated for all pairs of brands
included in the research. The attributes on which the
customer is to judge the difference are not specified. It
may be assumed that a consumer would tend to include
parameters such as price, quality of the product, after-sales
service, delivery time, his satisfaction with promised benefits,
perceptions about corporate image and so on, but he would not
specify these. He would only indicate distance (or dissimilarity)
as some numerical value.
Areas of Application
The most common and useful marketing application of
multidimensional scaling is in product positioning. Positioning
is essentially concerned with mapping a consumer's mind and
placing all competing brands in a product category in
appropriate slots or 'positions' on it. One obvious way to do
that is to ask customers what they think of competing brands on,
say, six attributes with a rating scale of 5 to 10 points. This
would result in ratings for all brands on all attributes, which
could be taken two attributes at a time and plotted on a graph.
ILLUSTRATION 1
Assuming that we have decided to use the three-dimensional
solution for interpretation in this case, the next task would
be to name the dimensions. For doing so, our previous
knowledge of the brands may become important. For example,
let us assume that the eight brands of soft drinks/mineral
water from Coca-Cola and Pepsi are as follows:
1. Pepsi
2. Coke
3. Maaza
4. Sprite
5. Fanta
6. Limca
7. Aquafina
8. Kinley
We must look at the qualities of various attributes offered by
these brands through our judgement or knowledge of the
market, through a survey of consumers, or by using a
combination of these methods. This process of interpretation
tends to be subjective, regardless of the method used. For
example, we could look at the output of multidimensional
scaling, and the scores for the eight brands on the three
dimensions, and decide on the following names for the three
dimensions:
Dimension 1: Taste
Dimension 2: Celebrity endorsement
Dimension 3: Value for money
We could then look at the scores on the three dimensions and
conclude that some brands like Pepsi and Coke currently enjoy a good
brand image and taste, but brands like Sunfil, Aquafina and Kinley
may score on dimension 3. Once a particular number of dimensions is
chosen, the researcher looks at the possible interpretations for the
dimensions, and names the dimensions. At this stage, the process is
subjective, and good judgement will produce a useful interpretation.
Similar brands can be seen close to each other on a map if the solution
chosen is a two-dimensional one. Multidimensional scaling provides
an unbiased view of consumer perceptions regarding similarities
between brands or objects. Sometimes this can provide startling new
insights for marketing managers and may lead to strategic changes in
brand positioning or communications.
Selected MDS Programs:
A number of computer programs are available for conducting
multidimensional scaling (MDS) analyses. These programs provide for a
variety of types of input data. Mainframe and microcomputer versions
of these programs are available. The following are the ones most
widely used for marketing applications.
MDPREF is designed to do MDS of preference or evaluation data. It is
a metric model based on a principal component analysis. Input data
usually are stimuli evaluation data, although paired comparisons can
be used in older versions of the model.
MDSCAL 5M constructs a configuration of points in space from
information about the distances between points. Input data are
proximities (similarities) of stimuli. Nonmetric and metric
scaling can be performed, as can nonmetric and metric unfolding.
INDSCAL performs a canonical decomposition of N-way tables
and analysis of individual differences in multidimensional
scaling. Proximity data are input and the program produces up
to a seven-way solution for 10 dimensions.
PREFMAP produces preference mapping analysis based on
generalization of the Coombsian unfolding model of preference.
The program relates preference data to a multidimensional
solution. Given a stimulus configuration and a set of preference
scales, the procedure finds for each individual an ideal point in
the given stimulus space.
PROFIT is a technique for fitting outside property vectors into
stimulus spaces. Input data are the coordinates of stimulus
points in k-dimensional space derived from an MDS procedure
and sets of independently determined physical measures
(properties).
KYST represents a blending of MDSCAL 5M and TORSCA 9. It
includes the initial configuration procedure from TORSCA and
has the capability of rotating solutions to principal components.
The program handles metric and nonmetric scaling and
unfolding and uses proximity input data.
Evaluating the MDS Solution
The fit between the derived distances and the proximities in
each dimension is evaluated through a measure called stress. In
MDS, though the objects can be projected onto two, three, four
or even higher dimensions, we always prefer lower dimensions.
Usually, the stress value increases when we decrease the
number of dimensions. The appropriate number of dimensions
required to locate the objects in space can be obtained by
plotting the stress values against the number of dimensions. As
with factor analysis (scree plot) and cluster analysis (error
sums of squares plot), one chooses the appropriate number of
dimensions depending on where the sudden jump in stress
starts to occur. Sometimes we directly seek a two-dimensional
representation, since managers usually prefer that because it is
easier to interpret.
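To make the idea of stress concrete, the following is a minimal Python sketch. It assumes a Kruskal stress-1 style formula in which the input dissimilarities are used directly as the targets (the metric case); the four objects and their coordinates are made up for illustration, and the one-dimensional map is simply the two-dimensional map with one axis dropped, not an optimized MDS solution.

```python
import math
from itertools import combinations

def distance(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def stress(points, dissimilarities):
    """Stress-1 style badness of fit: 0 means the map reproduces the input exactly."""
    numerator = 0.0
    denominator = 0.0
    for i, j in combinations(range(len(points)), 2):
        d = distance(points[i], points[j])            # derived distance on the map
        numerator += (d - dissimilarities[i][j]) ** 2
        denominator += d ** 2
    return math.sqrt(numerator / denominator)

# Made-up "true" positions of four objects (corners of a 3 x 1 rectangle).
true_positions = [(0.0, 0.0), (3.0, 0.0), (3.0, 1.0), (0.0, 1.0)]
n = len(true_positions)
dissimilarities = [[distance(true_positions[i], true_positions[j]) for j in range(n)]
                   for i in range(n)]

map_2d = true_positions                       # two-dimensional configuration
map_1d = [(x,) for x, _ in true_positions]    # crude one-dimensional configuration

stress_2d = stress(map_2d, dissimilarities)
stress_1d = stress(map_1d, dissimilarities)
print("stress with 2 dimensions:", round(stress_2d, 3))
print("stress with 1 dimension: ", round(stress_1d, 3))
```

The two-dimensional map has zero stress because it reproduces every dissimilarity, while the one-dimensional map cannot separate objects that differ only on the dropped axis, so its stress is clearly above zero. Plotting such values against the number of dimensions gives the plot described above.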
Issues in MDS
Perceptual maps are good vehicles to summarize the position of
brands and people in attribute space and, more generally, to
portray the relationship among any variables or constructs. They are
particularly useful to portray the positioning of existing or new
brands and the relationship of those positions to the relevant
segments. There is a set of problems and issues in working with
MDS:
(1) When more than two or three dimensions are needed, the
usefulness is reduced.
(2) Perceptual mapping has not been shown to be reliable
across different methods. Users rarely take the trouble to apply
multiple approaches to a context to ensure that a map is not
method-specific.
(3) Perceptual maps are static snapshots at a point of time. It is
difficult from the model to know how they might be affected by
market events.
(4) The interpretation of dimensions can be difficult. Even when
a dimension is clear, it can involve several attributes, and thus
the implications for action can be ambiguous.
(5) Maps usually are based on groups that are aggregated with
respect to their familiarity with products, their usage level and
their attitude. The analysis can, of course, be done with
subgroups created by grouping people according to their
preferences or perceptions, but with a procedure that is ad hoc,
at best.
(6) There has been little study of whether a change in the
perception of a brand, as reflected by a perceptual map, will
affect choice.
Summary of MDS
Application: MDS is used to identify dimensions by which
objects are perceived or evaluated, to position the objects with
respect to those dimensions, and to make positioning decisions
for new and old products.
Inputs: Attribute-based data involve respondents rating the
objects with respect to specified attributes. Similarity-based
data involve a rank order of between-object similarity that can
be based on several methods of obtaining similarity information
from respondents. Preference data also can provide the basis for
similarity measures and generate perceptual maps from quite a
different perspective. Ideal points or directions are based either
on having respondents conceptualize their ideal object, or on
generating rank-order preference data and using the data in a
second stage of analysis to identify ideal points or directions.
Output: The output will provide the location of each object on a
limited number of dimensions. The number of dimensions is
selected on the basis of a goodness-of-fit measure (such as the
percentage of variance in factor analysis) and on the basis of
the interpretability of the dimensions. In attribute-based MDS,
attribute vectors may be included to help interpret the
dimensions. Ideal points or directions may be an output in some
programs.
Key assumptions: The overriding assumption is that the
underlying data represent valid measures. Thus, we assume
that respondents can compare objects with respect to similarity
or preference of attributes. The meaning of the input data is
generally straightforward; however, the ability and motivation of
respondents to provide it often is questionable. A related
assumption is that the respondents use an appropriate context.
Some could base a rank-order preference of beer on the
assumption that it was to be served to guests. Others might
assume the beer was to be consumed personally. With
attribute-based data, it is assumed that the attribute list is
relevant and complete. If individuals are grouped, it is assumed
that their perceptions are similar. The ideal object introduces
additional conceptual problems. Another basic assumption is
that the interpoint distances generated by a perceptual map
have conceptual meaning that is relevant to choice decisions.
Limitations of MDS
A limitation of attribute-based methods is that the attributes
have to be generated. The analyst has the burden of making
sure that the attributes represent the respondents' perceptions
and evaluations. With similarity and preference data, this task is
eliminated. However, the analyst then must interpret
dimensions without the aid of such attributes, although attribute
data could be generated independently and attribute-dimension
correlations still obtained.
Multiple Discriminant Analysis
Discriminant analysis is an important technique for
classifying individuals or objects into one of two or more mutually
exclusive and exhaustive groups on the basis of a set of independent
variables. The analysis provides a predictive equation (the
discriminant function), whose coefficients indicate the relative
importance of each variable. The quality of a multiple discriminant
analysis is judged by the ability of the equation to predict the actual
group (one of two or more) to which each case belongs on the
dependent variable.
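As an illustration of such a predictive equation, here is a minimal Python sketch of a two-group linear discriminant in the style of Fisher, for two independent variables only. The data are made up, the function names are hypothetical, and the choice of Fisher's rule with a midpoint cutoff is an assumption made for illustration; a real study would use a statistical package and more than two variables.

```python
def mean_vector(rows):
    """Mean of each of the two variables."""
    n = len(rows)
    return [sum(r[k] for r in rows) / n for k in range(2)]

def scatter(rows, m):
    """Sums of squared deviations and cross-products around the group mean."""
    sxx = sum((r[0] - m[0]) ** 2 for r in rows)
    sxy = sum((r[0] - m[0]) * (r[1] - m[1]) for r in rows)
    syy = sum((r[1] - m[1]) ** 2 for r in rows)
    return sxx, sxy, syy

def fit_discriminant(group_a, group_b):
    """Return the weights of the discriminant equation and the cutoff score."""
    ma, mb = mean_vector(group_a), mean_vector(group_b)
    axx, axy, ayy = scatter(group_a, ma)
    bxx, bxy, byy = scatter(group_b, mb)
    sxx, sxy, syy = axx + bxx, axy + bxy, ayy + byy  # pooled within-group scatter
    det = sxx * syy - sxy * sxy
    dx, dy = ma[0] - mb[0], ma[1] - mb[1]
    # weights = inverse(pooled scatter) * (difference between group means)
    w = [(syy * dx - sxy * dy) / det, (-sxy * dx + sxx * dy) / det]
    # cutoff = discriminant score of the midpoint between the two group means
    cutoff = sum(wk * (a + b) / 2 for wk, a, b in zip(w, ma, mb))
    return w, cutoff

def classify(x, w, cutoff):
    """Assign a case to group A or B from its discriminant score."""
    score = w[0] * x[0] + w[1] * x[1]
    return "A" if score > cutoff else "B"

# Made-up scores on two independent variables for two known groups.
group_a = [(1, 2), (2, 3), (3, 3)]
group_b = [(6, 5), (7, 8), (8, 7)]
w, cutoff = fit_discriminant(group_a, group_b)

print("weights:", [round(v, 3) for v in w], "cutoff:", round(cutoff, 3))
print("new case (2, 2) ->", classify((2, 2), w, cutoff))
print("new case (7, 6) ->", classify((7, 6), w, cutoff))
```

The weights play the role of the coefficients of the predictive equation. Their sizes are comparable as measures of importance only when the variables are on similar scales, and the share of known cases classified correctly is one simple check on how well the equation predicts group membership.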
Latent Structure Analysis
Latent structure analysis* is a multivariate statistical analysis
technique which is used in factor, cluster and regression techniques.
In latent structure analysis, constructs are created
from a number of other unobserved variables, and these constructs
are further used for regression analysis. Latent structure analysis is
generally used to classify cases into latent classes. Latent class
analysis supports nominal, ordinal and continuous data. Structural
Equation Modeling (SEM) is the foremost type of latent class analysis.
Latent structure analysis is used when the variables involved in a
study do not possess dependency relationships and happen to be
non-metric. In this method, latent factors are extracted and relationships
of observed variables are expressed as their indicators to classify a
population of respondents into different types.
(*The terms latent class analysis and latent structure analysis are used
interchangeably within this section)
APPLICATION AREA OF LSA
A large number of applications of latent structure analysis have
appeared since the late nineties. It has been applied, and found
suitable, in many diverse areas such as psychiatry, medicine,
marketing, public opinion, management, planning, social issues etc.
Assumptions in latent structure analysis:
1. Nonparametric: LSA is a nonparametric technique; hence it does not
make any assumptions related to linearity, normal distribution or
homogeneity.
2. Data level: In LSA, the data level should be categorical or ordinal.
3. Identified model: LSA models should be exactly identified or
over-identified, and the number of equations in the LSA must be greater
than the number of estimated parameters.
4. Conditional independence: In LSA, the observations should
be independent within each class.
Basic concepts of latent structure analysis:
Latent classes: Latent classes are unobserved groupings of cases that are
inferred from the observed (manifest) variables. Latent classes divide the cases
into their dimensions related to the variable. For example, in LSA,
cluster analysis groups similar cases and puts them into one group.
The clusters in the cluster analysis are called the latent
classes. In structural equation modeling, the number of constructs is
called the latent structure.
Models in latent structure analysis: Generally the maximum
likelihood method is used to calculate the probability that a case will
fall in a particular latent structure.
Number of latent structures: Two methods are
available for determining the number of latent structures. The first
is the iterative goodness-of-fit method, in which classes are added
one at a time and the improvement in the goodness of fit of the model is
assessed using chi-square statistics. The second is the bootstrapping
method.
Classification of cases: Generally the Bayes approach is applied for
classification of the cases.
Latent class cluster analysis: It is a different form of the traditional
cluster analysis algorithms. The old cluster analysis algorithms were
based on the nearest distance, but latent class cluster analysis is
based on the probability of classifying the cases.
Latent class factor analysis: It is a different form of the traditional
factor analysis. Traditional factor analysis was based on the rotated
factor matrix. In latent class factor analysis, the factor is based on the
class. One class shows one factor.
Latent class regression analysis: In latent class regression
analysis, one set of items is used as in all latent class analysis, to
establish class memberships, and then additional covariates are used
to model the variation in class memberships.
Example: Table 1 (Table 2 from Lazarsfeld, 1950b) shows a typical
dataset for a latent structure analysis. Four dichotomous items provide
an empirical breakdown of 1000 respondents (soldiers) into 16
categories. The responses have been coded so that + indicates a
positive feeling about the Army, - a negative one.
TABLE 1 Manifest Data of Four Items on Attitude toward the Army
Item 1: In general how do you feel the Army is run?
Item 2: Do you think when you are discharged you will [have] a
favorable attitude toward the Army?
Item 3: In general do you feel you yourself have gotten a square deal
from the Army?
Item 4: Do you feel that the Army is trying its best to look out for
the welfare of enlisted men?
Item 1   Item 2   Item 3   Item 4   Count
  +        +        +        +        75
  +        +        +        -        69
  +        +        -        +        55
  +        -        +        +        42
  -        +        +        +         3
  +        +        -        -        96
  +        -        +        -        60
  +        -        -        +        45
  -        +        +        -        16
  -        +        -        +         8
  -        -        +        +        10
  +        -        -        -       199
  -        +        -        -        52
  -        -        +        -        25
  -        -        -        +        16
  -        -        -        -       229
A straightforward descriptive analysis of these data shows that
negative responses are more numerous except on item 1; and
that there is a positive association between each pair of items. A
soldier who responds positively to any one item is more likely to
respond positively to a second item. Lazarsfeld's analysis is
based on the assumption that each soldier can be thought of as
belonging to one of two latent classes. The probability of positive
response to an item is different in one group than in the other.
Most importantly, he is willing to assume that for an individual
respondent the responses to items are statistically independent.
That's the essence of a latent class model.
From an interpretive point of view, note the implications. The
items appear to be correlated because the population is
heterogeneous. If only one "class" had been interviewed no
correlations would be observed. Lazarsfeld coined the
term "local independence" to describe this condition.
In the two chapters in Measurement and Prediction Lazarsfeld
showed how some latent structure models might be defined,
and then showed how the parameters of those models could be
estimated from the manifest data. The latent structure of this
example is summarized in Table 2 (Table 6 of Lazarsfeld,
1950b).
TABLE 2 Computed Latent Structure for Attitude toward the Army
Latent Class    Probability of a + response
Frequencies     Item 1    Item 2    Item 3    Item 4
424.3           .9240     .6276     .5704     .5125
575.7           .4324     .1871     .1008     .0635
The results tell us that the population is divided roughly
40%/60% between those who are generally favorable to the
army and those who are generally negative. Almost everyone
(92%) who belongs to the former class will answer positively to
the question "In general how do you feel the Army is run" while
almost everyone (94%) in the latter class will respond
negatively to the fourth item, "looking out for the welfare of
enlisted men." It is possible to calculate for each of the 16
manifest response patterns in Table 1 the probability that that
response came from a Class 1 respondent. This "posterior
probability" can then be used as a numerical scale, a way of
ordering the 16 response patterns, and ultimately as a
characteristic of the respondent himself in subsequent analyses.
Of course this interpretation of the numbers in Table 2 is
predicated on the assumption that the mathematical model
accurately describes the behavior of soldiers answering
questions about their feelings.
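The posterior probability described above can be computed directly from the figures in Table 2. The following Python sketch is only an illustration: it assumes local independence within each class, applies Bayes' rule to each of the 16 response patterns, and uses hypothetical function names. It also checks that the model reproduces the roughly 641 positive answers to item 1 that can be counted from Table 1.

```python
from itertools import product

# Figures taken from Table 2: class sizes and probability of a "+" on items 1 to 4.
class_sizes = [424.3, 575.7]
p_positive = [
    [0.9240, 0.6276, 0.5704, 0.5125],  # Class 1 (generally favorable)
    [0.4324, 0.1871, 0.1008, 0.0635],  # Class 2 (generally negative)
]

def expected_counts(pattern):
    """Expected number of respondents in each class giving `pattern` (e.g. '++-+'),
    assuming the four answers are independent within a class."""
    counts = []
    for size, probs in zip(class_sizes, p_positive):
        likelihood = 1.0
        for answer, p in zip(pattern, probs):
            likelihood *= p if answer == "+" else 1.0 - p
        counts.append(size * likelihood)
    return counts

def posterior_class1(pattern):
    """Bayes' rule: probability that a respondent with this pattern is in Class 1."""
    c1, c2 = expected_counts(pattern)
    return c1 / (c1 + c2)

patterns = ["".join(p) for p in product("+-", repeat=4)]
for pattern in sorted(patterns, key=posterior_class1, reverse=True):
    print(pattern,
          "P(Class 1) =", round(posterior_class1(pattern), 3),
          "expected count =", round(sum(expected_counts(pattern)), 1))

# Expected number of "+" answers to item 1 under the model (Table 1 gives 641).
item1_positive = sum(sum(expected_counts(p)) for p in patterns if p[0] == "+")
print("expected + on item 1:", round(item1_positive, 1))
```

Sorting the patterns by this probability gives the numerical scale mentioned above, and comparing the expected counts with the observed counts in Table 1 gives a rough idea of how well the two-class model fits.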
SUMMARY
In this Chapter the definition of multivariate analysis has been
explained. The advantages, limitations and scope of multivariate
analysis have been discussed. A brief resume of multivariate
techniques has been provided.
REVIEW QUESTIONS
1. What do you understand by multivariate techniques?
Explain their significance.
2. Name the important multivariate analysis techniques and
explain the important characteristics of each technique.
3. Write short notes on:
a) Multidimensional Scaling
b) Cluster Analysis
c) Factor Analysis
d) Multiple Regression Analysis
e) Multiple Discriminant Analysis.
4. What is factor analysis? What are its essential features?
Give important applications of factor analysis.
5. Explain the following terms used in factor analysis:
a) Factors
b) Factor loading
c) Communality
d) Eigenvalue
e) Factor Rotation
f) Factor Score
6. In a factor analysis, the loadings of statements 1 to 6 on
factor F1 are 0.80, 0.80, 0.30, 0.20, 0.10, 0.10; on factor F2
are 0.10, 0.20, 0.30, 0.30, 0.90, 0.10; and on factor F3 are
0.10, 0.10, 0.20, 0.20, 0.10, 0.70 respectively. Calculate the
communalities and eigenvalues and interpret the
resulting table.
7. Extract the first two factors from the following correlation
matrix by the centroid method, determine the eigenvalues
and communalities and comment on the results:
       V1     V2     V3     V4     V5     V6
V1    1.00   0.55   0.43   0.32   0.28   0.36
V2           1.00   0.50   0.25   0.31   0.32
V3                  1.00   0.39   0.25   0.33
V4                         1.00   0.43   0.49
V5                                1.00   0.44
V6                                       1.00
8. What is cluster analysis? Describe some typical
applications of cluster analysis in theoretical as well as
applied research.
9. From the following matrix of correlation coefficients
between five variables A, B, C, D and E, find clusters
using Johnson's hierarchical scheme. State the
assumptions, if any.
Variables    A      B      C      D      E
A           1.00   0.45   0.36   0.41   0.82
B                  1.00   0.24   0.71   0.41
C                         1.00   0.21   0.11
D                                1.00   0.31
E                                       1.00
10. On the basis of income and population scores, group the
15 cities into clusters.
City Income Score Population Score
A 2.50 2.50
B 2.00 2.50
C 3.00 2.00
D 2.50 1.75
E 0.25 1.00
F 1.50 1.75
G 2.25 2.00
H 2.00 1.50
I 0.25 0.50
J 0.25 0.50
K 0.50 0.50
L 2.50 2.00
M 2.50 2.50
N 0.25 0.25
O 0.25 0.25
CHAPTER 12 MODEL BUILDING AND
DECISION MAKING
Objectives
After reading this Chapter, the learner would be able to:
• Understand fundamental concepts behind model building
• Know about various types of models
• Understand various methods prevalent for model building
• Appreciate the role of model building and its usage in a
decision making process
Structure
• Fundamentals of Model Building
• Types of Model
• Methods of Building Model
• Phases in Model Construction
• Model Building and Decision Making
• Application of Model Building in Decision Making
• Summary
• Review Questions
• Further Readings
FUNDAMENTALS OF MODEL BUILDING
Models play a very important role in research. Models simulate
descriptions and explanations of the operations of the system
that they represent. By experimenting with models one can
determine how changes in conditions will affect the
performance of a system. Models enable us to experiment in a
more cost-effective manner than on the system itself, where
experimentation is either impossible or too costly.
A model may be defined as an idealized representation of a
real-life system.
ADVANTAGES OF A MODEL
Models have many advantages over mere description of a
research problem. Some of them are as follows:
i) It depicts a research problem much more precisely.
ii) It provides a logical and systematic approach to the
research problem.
iii) It indicates the limitations and scope of the research
problem.
iv) It presents the overall structure of the research problem
more comprehensively.
v) It facilitates dealing with the problem in its totality.
vi) It enables the use of mathematical techniques to analyse
the research problem.
DISADVANTAGES OF A MODEL
Models have a few disadvantages, which are as follows:
i) Models are an attempt at understanding a research
problem and should never be considered
absolute.
ii) The validity of any model with regard to the research
problem at hand can only be verified by carrying out
experiments and by characteristics of data thus obtained.
CHARACTERISTICS OF GOOD MODELS
i) It should be capable of adjustments with new
experimental situations without having any significant
change in its framework
ii) It should contain limited variables.
iii) A model should not consume too much time in its
construction
Activity 12.1
Go down memory lane to your school days and draw a
parallel from the various physical models we used to build for
science projects.













TYPES OF MODELS
There are two classes of models that are commonly used in
research activities, classified as follows:
A) Physical Models
B) Symbolic Models
PHYSICAL MODELS
These models give the appearance of the real system;
examples include toys and photographs. These models
easily depict the system but are not easily manipulated. This
makes them of little value for the purpose of analysis and
prediction. Physical models are the least abstract of all models.
These are of two subtypes:
ICONIC MODELS
ANALOGUE MODELS
i) Iconic models: Iconic models represent the system as it
is but in a different size. Thus iconic models are obtained
by enlarging or reducing the size of the system. In other
words, they are images.
Some common examples are photographs, drawings,
model airplanes, ships, engines, globes, maps etc. A toy
airplane is an iconic model of a real one. Iconic models of
the sun and its planets are scaled down, while the model
of the atom is scaled up so as to make it visible to the
naked eye. Iconic models have some advantages as
well as disadvantages, as follows:
Advantages:
i) These are specific and concrete
ii) These are easy to construct
iii) These can be studied more easily than the system itself.
Disadvantages:
i) These are difficult to manipulate for experimental
purposes
ii) They cannot be used to study the changes in the
operation procedures.
iii) It is not easy to make any modification or improvement in
these models.
iv) Adjustments with changing experimental situations
cannot be done in these models.
Activity 12.2
Would you consider map of a city as an iconic model of a city?
What would be disadvantages in use of such a model?











ii) Analogue Models: In analogue models one set of properties
is used to represent another set of properties. After the
problem is solved, the solution is reinterpreted in terms of
the original system.
For example, contour lines on a map are analogues of
elevation as they present the rise and fall of heights. Graphs
are analogues as lines are used to represent a wide
variety of variables such as time, percentage, weight etc.
Advantage
Analogue models are easier to manipulate than iconic
models.
Disadvantages
Analogue models are less specific and less concrete.
Activity 12.3
Would you consider isobars and isotherms as an analogue model
of the climate of a country? What could be the advantage of such a
depiction?













SYMBOLIC MODELS
In this class of models, letters, numbers and other types of
mathematical symbols are used to represent variables and the
relationships between them. Thus symbolic models are some
kind of mathematical equations or inequalities reflecting the
structure of the system they represent. Inventory models,
queuing models etc. are examples of symbolic models.
Symbolic models are the most abstract models and, therefore,
usually general in nature. Symbolic models can be manipulated
easily and are, therefore, of great value for analysis and prediction.
Hence, symbolic models are often used in research.
Symbolic models have the following subtypes:
1. Mathematical Models
2. Function Models
3. Quantitative Models
4. Qualitative Models
5. Heuristic Models
1. Mathematical Models: Models described by
means of mathematical symbols and equations are known
as mathematical models. For example, a simulation model
uses mathematical formulae. This model is very
commonly used by managers to simulate their
decision-making process.
2. Function Models: Models may also be grouped
according to the mathematical function used. For
example, a function may serve to acquaint the analyst
with the growth pattern of consumer demand.
3. Quantitative Models: These are models that can
measure observations. A yardstick, a unit of
measurement of length, value, degree of temperature,
etc. are quantitative models. Other examples of
quantitative models are the transformation models that
help in converting a measurement on one scale into one on
another scale (e.g. logarithmic tables, Centigrade vs.
Fahrenheit conversion scale) and the test models that act
as standards against which measurements are compared
(e.g., a specified standard production control, business
dealings, the quality of a medicine).
4. Qualitative Models: These are models that can be classified by
the subjective description in terms of numeric data.
Examples of these are the economic models and the
business models, which represent the gathering and
representation of data pertaining to an economic or business
research problem respectively.
5. Heuristic Models: These models are mainly used to
explore alternative strategies (courses of action) which have
been overlooked previously, using mathematical models
to represent systems that define strategies.
Activity 12.4
Monte Carlo simulation can be used as a quantitative model for
representing probabilistic situations. Do you agree?





METHODS OF BUILDING MODELS
Generally, the following methodology is used for model building:
i) Analytical Methods: These methods involve all the tools
of classical mathematics, such as calculus, finite
differences etc. The kind of mathematical model
required for a particular research study depends upon
the nature of the study.
ii) Numerical Methods: Numerical methods are concerned
with iterative (or trial-and-error) procedures,
through the use of numerical computations at each
step. These methods are generally used when
analytical methods fail to derive the solution. In these
methods, we start with a trial solution and a set of
rules for improving it. The trial solution is improved by
the given rules and is then replaced by the improved
solution. The process is continued up to the step
after which no further improvement is possible.
iii) Monte Carlo technique or simulation: The basis of
Monte Carlo technique is random sampling of a
variable's possible values. For this technique some
random numbers are required, which may be converted
into random variates whose behavior is known from past
experience. Donsker and Kac define Monte Carlo
technique as a combination of probability methods and
sampling techniques providing solutions to
complicated partial or integral differential equations.
In short, the Monte Carlo technique is concerned with
experiments on random numbers and it provides
solutions to complex problems. Monte Carlo
techniques are useful in the following situations:
i. Where one is dealing with a problem which
has not yet arisen, i.e., where it is not
possible to gain any information from past
experience.
ii. Where the mathematical or statistical problems
are too complicated and some alternative
methods are needed.
iii. To estimate the parameters of a model.
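The iterative procedure described under Numerical Methods in item (ii) above, a trial solution plus a rule for improving it, repeated until no further improvement is possible, can be sketched as follows. The example is an assumption chosen for illustration: it approximates the square root of 2 using the classical averaging rule often attributed to Heron, and the function name and tolerance are hypothetical.

```python
def improve_until_stable(trial, improve, tolerance=1e-10, max_steps=100):
    """Replace the trial solution by its improved version until it stops changing."""
    for step in range(1, max_steps + 1):
        better = improve(trial)
        if abs(better - trial) < tolerance:   # no further improvement is possible
            return better, step
        trial = better                        # the improved solution becomes the new trial
    return trial, max_steps

target = 2.0
# Improvement rule: average the current guess x with target / x.
root, steps = improve_until_stable(1.0, lambda x: 0.5 * (x + target / x))
print("approximate square root of 2:", root, "after", steps, "steps")
```

The same loop structure applies to any numerical method of this kind; only the improvement rule and the stopping test change from problem to problem.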
The main steps of the Monte Carlo method are as follows:
i) To get the general idea of the system, a flow diagram is
drawn.
ii) Then proper sample observations are taken to select a
suitable model for the system. In this step, a
probability distribution for the variable of interest
is determined.
iii) Then the probability distribution is converted to a
cumulative distribution function.
iv) Then a sequence of random numbers is selected with the
help of a random number table.
v) Then a sequence of values of the variable of interest is
determined from the sequence of random numbers
obtained in step (iv).
vi) Finally, some standard mathematical function is applied
to the sequence of values obtained in step (v).
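Steps (iii) to (vi) above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the variable of interest and its probability distribution are made up, Python's pseudo-random number generator stands in for the random number table of step (iv), and the "standard mathematical function" of step (vi) is taken to be the average.

```python
import random
from bisect import bisect_right
from itertools import accumulate

# Step (ii): a made-up probability distribution for the variable of interest.
values = [0, 1, 2, 3]
probabilities = [0.1, 0.4, 0.3, 0.2]

# Step (iii): convert the distribution to a cumulative distribution function.
cumulative = list(accumulate(probabilities))   # approximately [0.1, 0.5, 0.8, 1.0]

def draw(rng):
    """Steps (iv) and (v): take a random number in [0, 1) and read off the
    value whose cumulative interval contains it."""
    u = rng.random()
    k = bisect_right(cumulative, u)
    return values[min(k, len(values) - 1)]     # guard against rounding at the top end

rng = random.Random(42)                        # fixed seed so the run is repeatable
sample = [draw(rng) for _ in range(10_000)]

# Step (vi): apply a standard function, here the mean, to the simulated values.
average = sum(sample) / len(sample)
expected = sum(v * p for v, p in zip(values, probabilities))
print("simulated mean:", round(average, 3), "theoretical mean:", round(expected, 3))
```

With a large number of draws the simulated mean settles close to the theoretical mean of 1.6, which illustrates why the answers from this method are good only when the sample is sufficiently large.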
ADVANTAGES OF MONTE CARLO METHODS
i) These are helpful in finding solutions of complicated
mathematical expressions, which is not possible
otherwise.
ii) By these methods the difficulties of trial-and-error
experimentation are avoided.
DISADVANTAGES OF MONTE CARLO METHODS
i) These are costly ways of getting a solution to any
problem.
ii) These methods do not provide an optimal answer to the
problems. The answers are good only when the size of
the sample is sufficiently large.
Activity 12.5
1. The customers of the state distribution Corporation send their
own purchase orders. In the past, the arrival of these purchase
orders per day has approximated a normal distribution with a
mean of 50 and a standard deviation of 6. In terms of the
probability of occurrence, the following is indicated.
Develop a Monte Carlo simulation for the number of purchase
orders per day to be expected for a particular month. If the firm
can process only 41 orders per day, how many days in that
month will the firm be behind schedule?
Number of Purchase Orders    Probability of Occurrence (%)
26-32                          0.5
32-38                          2.0
38-44                         13.0
44-50                         36.0
50-56                         33.0
56-62                         13.0
62-68                          2.0
68-74                          0.5
2. A firm has a single-channel service station with the following arrival
and service time probability distributions.
Arrival (min)    Probability    Service Time (min)    Probability
1.0              0.35           1.0                   0.20
2.0              0.25           1.5                   0.35
3.0              0.20           2.0                   0.25
4.0              0.12           2.5                   0.15
5.0              0.08           3.0                   0.05
The customers' arrival at the service station is a random
phenomenon and the time between arrivals varies from one
minute to five minutes. The service time varies from one
minute to three minutes. The queuing process begins at 10.00
a.m. and proceeds for nearly 2 hours. An arrival goes to the
service facility immediately, if it is free. Otherwise it will wait in
a queue. The queue discipline is first come, first served.
If the attendant's wages are Rs. 8 per hour and the customer's
waiting time costs Rs. 9 per hour, would it be an economical
proposition to engage a second
attendant? Answer on the basis of the Monte Carlo simulation
technique.
































PHASES OF MODEL CONSTRUCTION
The major phases of model construction and use are as
follows:
1. Formulation of the problem
2. Construction of the model
3. Solution of the model
4. Validation of the model
5. Developing control over the solution
6. Implementation of the final solution.
The first phase of model building requires the problem to be
formulated in an appropriate form. This should clearly yield a
statement of the problem's elements that include the
controllable (decision) variables, the uncontrollable
parameters, the restrictions or constraints on the variables,
and the objectives for defining a good or improved solution.
The second phase of Model Construction is concerned with
the choice of proper structural elements and the
representation of interrelationship among the elements in
terms of mathematical formulae. A model should include
mainly the following three basic sets of components.
a) Decision Variables and Parameters: Decision
variables are those unknowns that are to be determined
from the solution of the problem, whereas parameters are
the given uncontrollable variables of the model. For
example, in a transportation problem, the quantities to be
transported to various destinations are decision variables
whereas the costs per unit of transportation from various
source locations to different destinations are parameters
of the model.
b) Constraints or Restrictions: The model must include
constraints in order to account for the physical limitations
of the system. For example, in a linear programming
model, let $X_1$, $X_2$ denote the number of transistors of type
$T_1$ and $T_2$, respectively, to be manufactured (decision
variables) per day and let $A_1$, $A_2$ be their respective unit
costs of production (parameters). If the total budget of the
manufacturer allows a maximum of Rs. $R$ to be spent per
day, then the corresponding constraint is
$A_1 X_1 + A_2 X_2 \le R$.
c) Objective Function: The model must also include an
objective function which defines the measure of
effectiveness of the system as a mathematical function of
its decision variables. For instance, if the objective of the
system is to minimize the total transportation cost, then
the objective function must specify the transportation
cost in terms of the decision variables. In fact, objective
function acts as an indicator for the achievement of the
optimum solution to the model and thus a poor
formulation of the objective function can only lead to a
poor solution to the problem.
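The three sets of components can be illustrated with a small sketch in Python. The figures used below (unit costs, budget and unit profits) are assumptions made only for illustration and do not come from the text; the profit-maximizing objective is likewise a hypothetical choice. The sketch enumerates integer production plans, keeps those that satisfy the budget constraint, and selects the plan with the best objective value.

```python
from itertools import product

# Hypothetical parameters (not from the text): unit production costs in Rs.
# for transistor types T1 and T2, and the daily budget R.
A1, A2 = 4.0, 6.0
R = 120.0
# Hypothetical unit profits, used only to give the sketch an objective.
PROFIT1, PROFIT2 = 3.0, 5.0


def is_feasible(x1: int, x2: int) -> bool:
    """Budget constraint A1*X1 + A2*X2 <= R with non-negative quantities."""
    return x1 >= 0 and x2 >= 0 and A1 * x1 + A2 * x2 <= R


def objective(x1: int, x2: int) -> float:
    """Measure of effectiveness as a function of the decision variables."""
    return PROFIT1 * x1 + PROFIT2 * x2


def best_plan(max_units: int = 40):
    """Enumerate integer plans and keep the feasible one with the largest objective."""
    candidates = (
        (x1, x2)
        for x1, x2 in product(range(max_units + 1), repeat=2)
        if is_feasible(x1, x2)
    )
    return max(candidates, key=lambda plan: objective(*plan))


plan = best_plan()
print(plan, objective(*plan))
```

With these assumed figures the best plan produces only type T2, because T2 earns more per rupee of budget. For problems of realistic size, a linear programming solver would replace the enumeration.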
The third phase of model building deals with the
mathematical calculations for obtaining the solution to the
model. Frequently, a solution of the model means those
values of the decision variables that optimize one of the
objectives and give permissible levels of performance on the
other objectives.
The fourth phase of the model construction methodology
involves checking the validity of the model used. A model
may be said to be valid if it can give a reliable prediction of
the system's performance. A good researcher realizes
that the model must have a long life and consequently
updates it from time to time to take account of the past,
present and future specifics of the problem.
The fifth phase in model building establishes control over the
solution through proper feedback of information on variables
which have deviated significantly. As soon as one or more of the
controlled variables change significantly, the solution goes out
of control. In such a situation, the model may be modified
accordingly.
The sixth and final phase of the model building exercise deals
with the implementation of the tested results of the model.
This would basically involve a careful explanation of the
solution to be adopted and its relationship with the operating
realities. This phase of research investigation is executed
primarily through the cooperation of experts who are
responsible for managing and operating the system.
MODEL BUILDING AND DECISION MAKING
Model building uses scientific methodology to understand
and explain phenomena or operating systems. It devises
theories to explain these phenomena, uses these
theories to describe what takes place under altered
conditions, and checks these predictions against new
observations. Thus model building may be regarded as a tool
employed to increase the effectiveness of managerial
decisions as an objective supplement to the intuitive feeling
of the decision-maker.
For instance, in distribution or allocation areas, modeling
may lead to the best locations for agencies and warehouses as
well as the most economical means of transportation. In the
marketing area, it may aid in indicating the most profitable
type, use and size of advertising campaigns within the
available financial limits. Use of models may suggest
alternative courses of action when a problem is analysed and
a solution is attempted. However, the study of complex
problems by building models becomes useful only when a
choice between two or more courses of action is possible.
Model building may be regarded as a tool that enables the
decision-maker to be objective in choosing an alternative
from among the many that he can conceive of. Following are the
salient advantages of using a model building approach in
decision-making.
i) Better Decisions: Models frequently yield actions that
improve on intuitive decision making. A situation may
be so complex that the human mind can never hope to
assimilate multiple significant factors without the aid
of computer analysis based on a well developed
model.
ii) Better Coordination: Sometimes models have been
instrumental in bringing order out of chaos. For
instance, model-oriented planning becomes a device
for coordinating marketing decisions within the
constraints of manufacturing capabilities.
iii) Better Control: The managements of large
organizations recognize that it is extremely costly to
have continuous supervision over routine decisions.
The model approach frees executives
to devote their attention to more strategic matters by
leaving supervision of production scheduling and
inventory replenishment to computer-aided models.
iv) Better Systems: Often, model building is initiated to
analyze a particular decision problem, such as whether
to open a new warehouse. Afterwards, the approach is
further developed into a system to be employed
repeatedly. Thus, the cost of undertaking the first
application may produce benefits.
PHASES AND PROCESSES OF MODELLING
Formulate the problem: This is the leading process; it
is normally lengthy and prolonged. This step comprises
visits, observations, research, etc. With the assistance of
such activities, the researcher gets ample information
and support to formulate the problem. This process starts
with an understanding of the organizational environment, its
objectives and hopes.
Develop a model: After problem formulation, the next
step is to represent the problem in the form of a
mathematical model which represents the systems,
environment or processes in the form of equations,
associations or formulas. We should identify the static
and dynamic structural basics of the model with the
mathematical formulas which represent the
interrelationships among elements. The anticipated model
may be field tested and modified if the management is
not happy with the output that it gives.
Select appropriate data input: "What you seed is
what you get" is a well-known saying in the English language. No
model will work properly if the data input is inappropriate.
The aim of this step is to have adequate input to run and
test the model.
Solution of the model: After selecting suitable data
input, one should obtain the solution of the model. If
the model is not functioning appropriately, then updating
and revision are done at this stage.
Validation of the model: A model is said to be valid if it
can provide a reliable prediction of the system's
performance. A model must be applicable for a long
time and should be updated from time to time, taking into
consideration the past, present and future aspects of the
problem.
Implement the solution: The aim of this step is to
eliminate the gap between the researcher/modeler and the user
of the model. To achieve this, the scientist as well as
management should play a positive role, because a
properly implemented modeling technique results in
improved management decisions.
VARIOUS TYPES OF MODELS
1. Iconic model: It looks like what it represents. A
photograph and a painting are iconic models of persons or
objects. Commonly an iconic model represents a static
event. An iconic model is said to be scaled down when the
dimensions of the model are smaller than those of the
real object; for example, a globe representing the earth.
A model is said to be scaled up when it is bigger than its
real entity; for example, the physicist's model of an atom
or the sketch of an insect.
2. Analogue model: It represents a system or an object of
inquiry by using a set of properties different from those
that the original system possesses. For example, a
computer is the physical representation of the variables in
a problem.
3. Descriptive model: Descriptive models simply describe
some aspects of a situation based on observation, survey,
questionnaire results, or available data. The result of an
opinion poll represents a descriptive model.
4. Predictive models: Such models can answer "what if"
type questions, i.e., they make predictions regarding
certain events. For example, based on survey results,
television networks attempt to explain and predict the
election outcome before all the votes are actually
counted.
5. Normative models: Finally, when a predictive model has
been repeatedly successful it can be used to prescribe a
course of action. Linear programming is a normative or
prescriptive model, because it prescribes what the
managers ought to do.
6. Deterministic models: Such models assume conditions
of complete certainty and perfect knowledge. Linear
programming, transportation and assignment models are
examples of deterministic models.
7. Probabilistic models: These models handle those
situations in which the consequences or payoff of
managerial action cannot be predicted with certainty.
However, it is possible to forecast a pattern of events,
based on which managerial decisions can be made. For
example, insurance companies are willing to insure
against the risk of fire, accident, sickness and so on, because
the patterns of the events have been compiled in the
form of probability distributions.
8. Static models: These models do not consider the impact
of changes that take place during the planning horizon,
i.e., they are independent of time. Also, in a static model
only one decision is needed for the duration of a given
time period.
SOME FAMOUS MODELS OF OPERATIONS RESEARCH
Linear Programming. It is a mathematical technique for
allotting a fixed amount of resources so as to optimize a
linear function subject to a set of linear equations or
inequalities called constraints.
Transportation Problem. The objective of
transportation problem is to minimize the total cost of
transporting a homogeneous commodity (product) from
supply centres to demand centres.
Assignment Problem. An assignment problem is a
particular case of transportation problem where the
objective is to assign a number of resources to an equal
number of activities so as to minimize total cost or
maximize total profit of allocation.
Game Theory. These models are used to determine the
behavior of decision making under conflicting situations.
Methods for solving such models have not been found
suitable for industrial applications, mainly because they
refer to an idealized world and neglect many
essential features of reality.
Inventory Models. These models are concerned with the
determination of the economic order quantity and
production intervals, considering factors such as
demand per unit time and the costs of placing orders, acquisition,
storage and handling of inventories. They help managers to
take decisions regarding inventory management.
Replacement Models. These models deal with the
determination of the optimum replacement policy in situations
that arise when some items or machinery need
to be replaced by new ones.
Markov Process. These models are applicable in
situations where the state of the system can be defined
by some descriptive measure of numerical value and
where the system moves from one state to another on a
probability basis. Brand-switching problems considered
in marketing studies are an example of such models.
Simulation. Simulation is a representation of reality
through the use of a model or other device which will
react in the same manner as reality under a given set of
conditions.
Dynamic Programming. Dynamic programming is a
mathematical technique dealing with the optimization of
multistage decision process.
Nonlinear Programming. This method usually refers to
problems in which the objective function is nonlinear,
or one or more of the constraint inequalities are
nonlinear, or both.
Integer Programming. These methods may be used when
the variables are restricted to integer (or
discrete) values. Examples are the number
of trucks in a fleet, the number of generators in a power
house, etc.
Job Sequencing Models. These models involve the
selection of a sequence for performing a series of
jobs on service facilities (machines) that
optimizes the efficiency measure of performance of the
system. The scheduling of service or sequencing of jobs is
done to minimize the relevant costs.
Network Scheduling Models. These models are applicable in
large projects involving complexities and
interdependencies of activities. Project Evaluation and Review
Technique (PERT) and Critical Path Method (CPM)
are used for planning, scheduling and controlling complex
projects which can be characterized as networks.
Goal Programming. Goal Programming (GP) is an
approach used for solving a multi-objective optimization
problem that balances trade-offs among conflicting objectives.
In other words, it is a powerful approach for deriving the best
possible satisfactory level of goal attainment.
Symbolic Logic. It deals with substituting symbols for
words, classes of things, or functional systems. It
incorporates rules, algebra of logic, and propositions.
There have been only limited attempts to apply this
technique to business problems; however, it is
extensively used in designing computing machinery.
Queuing Theory. Queuing theory is concerned with the
statistical description of the behavior of queues, for example
finding the probability distribution of the number in
the queue, from which the mean and variance of queue
length, the probability distribution of waiting time for
a customer, or the distribution of a server's busy periods
can be found.
Information Theory. It is an analytical process
transferred from the electrical communications field to
operations research. It seeks to evaluate the
effectiveness of information flow within a given system
and helps in improving the communication flow.
APPLICATION OF MODEL BUILDING IN
DECISION MAKING
APPLICATIONS OF DYNAMIC PROGRAMMING
The technique of Dynamic Programming was developed by
Richard Bellman in the early 1950s. Bellman's Principle of
Optimality states that:
An optimal policy (set of decisions) has the property that
whatever be the initial state and initial decisions, the
remaining decisions must constitute an optimal policy for the
state resulting from the first decision.
Bellman's Principle of Optimality applies to multistage
decision problems.
Multistage: A problem in which decisions have to be
made at successive stages is called a multistage decision
problem. Multistage problems can be classified on the basis of
the following properties:
1. The outcome of a decision may be deterministic or
stochastic (probabilistic). In the deterministic case, if the
state of a process is given, the outcome of a decision at
any state is uniquely determined and known. In the
stochastic case, there is a set of possible outcomes given
by a known probability distribution.
2. The number of possible decisions at any stage, from which
we have to choose one, may be finite or infinite.
3. The total number of stages in the process may be finite or
infinite and may be known or unknown.
CHARACTERISTICS OF DYNAMIC PROGRAMMING PROBLEM
The basic features which characterize a dynamic programming
problem may be presented as follows:
1. The problem can be divided into stages, with a policy
decision required at each stage.
2. Each stage has a number of states associated with it.
3. The effect of the policy decision at each stage is to
transform the current state into a state associated with
the next stage.
This suggests that a dynamic programming problem can be
interpreted in terms of a network.
4. Given the current state, an optimal policy for the
remaining stages is independent of the policy adopted in
previous stages.
For a dynamic programming problem, in general, knowledge of the
current state of the system conveys all the information
about its previous behavior necessary for determining the
optimal policy henceforth. This is the Markovian property.
5. The solution procedure begins by finding the optimal
policy for each state of the last stage.
6. A recursive relationship (functional equation) is available
which identifies the optimal policy for each state with n
stages remaining, given the optimal policy for each state
with (n - 1) stages remaining.
7. Using the recursive relationship (functional equation), the
solution procedure moves backward stage by stage,
each time finding the optimal policy.
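The backward solution procedure described in these characteristics can be sketched in Python on a small staged network. The network, its state names and its costs are hypothetical and chosen only for illustration; the sketch assumes a deterministic problem with a single terminal state.

```python
# Hypothetical staged network: each dictionary maps the states of one stage
# to the cost of moving to each state of the next stage.
stages = [
    {"A": {"B1": 2, "B2": 4}},
    {"B1": {"C1": 7, "C2": 3}, "B2": {"C1": 1, "C2": 5}},
    {"C1": {"D": 4}, "C2": {"D": 6}},
]


def solve_backward(stages, terminal="D"):
    """Return (value, policy): value[s] is the minimum cost from state s to
    the terminal state and policy[s] is the best next state from s."""
    value = {terminal: 0}
    policy = {}
    # Begin with the last stage and move backward one stage at a time.
    for stage in reversed(stages):
        for state, moves in stage.items():
            # Recursive relationship:
            # f(state) = min over decisions of (immediate cost + f(next state)).
            best_next = min(moves, key=lambda nxt: moves[nxt] + value[nxt])
            value[state] = moves[best_next] + value[best_next]
            policy[state] = best_next
    return value, policy


def optimal_route(policy, start="A", terminal="D"):
    """Follow the stored decisions forward from the start state."""
    route = [start]
    while route[-1] != terminal:
        route.append(policy[route[-1]])
    return route


value, policy = solve_backward(stages)
print(value["A"], optimal_route(policy))
```

In this assumed network the cheapest first move (A to B1) is not part of the optimal route, which is why the procedure evaluates each state of the later stages before deciding at the earlier ones.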
APPLICATION OF QUEUING THEORY
Historical development: Theoretical research into the
properties of queues began with the problem of
telephone calls, studied by the Danish engineer A.K. Erlang in 1909. A
systematic approach to the problem was developed by D.G.
Kendall in 1951 by using model terminology, and since then
significant work has been done in this direction. Queuing
theory has now been applied to a wide variety of operations. The
basic feature is that arrivals at the service stations are
eventually discharged after service is provided.
The Behaviour of Customers: Human behavior and the servicing
facilities in any system are important factors in the development
of a queuing model. Customer behavior can be classified into the
following categories:
i) Balking: A customer may not join the queue because he
does not want to wait.
ii) Reneging: A customer may leave the queue due to impatience.
iii) Collusion: Several customers may collaborate and only
one of them may stand in the queue.
iv) Jockeying: If there are a number of queues, a customer
may leave one queue to join another.
The Facilities of Service Station can be classified as follows:
a) Single Channel: There may be only one counter for
servicing and as such only one unit can be served at a
time.
b) Multi Channel: Due to a rush of customers, management
may decide to provide a number of counters so that the
queue length does not become unreasonably large and
the organization does not lose customers due to a long
queue. But too many counters may result in long idle
time due to a shortage of customers.
The following are the various definitions in a queuing problem:
i) Queue length: Number of persons waiting in the
line at any time.
ii) Waiting time: The time for which a unit has to
wait, after arriving at the servicing station, before it
is taken into service.
iii) Servicing time: The time taken for servicing a
particular arrival.
iv) Average length of line: The average number of customers
in the queue per unit of time.
v) Average idle time: The average time for
which the system remains idle.
vi) Queue discipline: It describes what happens to a unit
between the moment of its arrival and the moment
it leaves the system.
Classification of Queues and their Problems:
The mathematical description of a queue can be formulated by
means of a model expressed as A/B/S : (d/e), where
A: Arrival pattern of the units, given by the probability
distribution of the inter-arrival time of units.
B: Probability distribution of the servicing time of individuals
actually being serviced.
S: Number of servicing channels in the system.
d: Capacity of the system, i.e. the maximum number of
units the system can accommodate at any time.
e: Manner in which units are selected for service in the
system.
The following are some of the ways in which units are selected
for service.
i) FIFO = First In (arrival) First Out (departure). This
practice is generally followed at booking stations of
Railways, Bus Stands, Theatres etc.
[Figure: Single-channel queue. Input (arrivals for service) -> queue (arrivals waiting for service) -> service (one item being served at a time) -> output (departure).]
[Figure: Pattern of a multichannel queue discipline. Input -> three parallel queues, each with its own service channel -> output (departure).]
ii) SIRO = Service In Random Order. This is generally done
while censoring a large bundle of letters posted to
foreign countries. The items for servicing are selected
at random.
iii) LIFO = Last In First Out. This generally happens in
the case of office files. The files are heaped one over
another, and the one arriving last, being at the top, is
attended to first.
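The three selection rules (FIFO, SIRO and LIFO) can be compared with a minimal Python sketch. It assumes, as a simplification, that all units are already waiting when service starts; the function name and the seed parameter are illustrative choices, not part of the text.

```python
import random
from collections import deque


def serve_order(arrivals, discipline, seed=0):
    """Return the order in which waiting units are taken into service.

    `arrivals` lists the units in order of arrival; all of them are assumed
    to be waiting already when service starts.
    """
    waiting = deque(arrivals)
    rng = random.Random(seed)  # seeded so that the SIRO order is reproducible
    served = []
    while waiting:
        if discipline == "FIFO":
            served.append(waiting.popleft())   # earliest arrival first
        elif discipline == "LIFO":
            served.append(waiting.pop())       # latest arrival first
        elif discipline == "SIRO":
            idx = rng.randrange(len(waiting))  # any waiting unit, at random
            served.append(waiting[idx])
            del waiting[idx]
        else:
            raise ValueError(f"unknown discipline: {discipline}")
    return served


files = ["file1", "file2", "file3", "file4"]
print(serve_order(files, "FIFO"))
print(serve_order(files, "LIFO"))
print(serve_order(files, "SIRO"))
```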
We can see that under fixed conditions of customer arrivals
and servicing facility, the queue length is a function of time. As
such, a queue system can be considered a sort of random
experiment, and the various events of the experiment can be
taken to be the various changes occurring in the system at any
time. The various characteristics of queuing disciplines can be
studied with the help of probability distributions.
Example 1. A television repairman finds that the time spent on
his jobs has an exponential distribution with a mean of 30 minutes.
If he repairs sets in the order in which they come in, and if the
arrival of sets follows approximately a Poisson distribution with
an average rate of 10 per 8-hour day, what is the repairman's
expected idle time each day? How many jobs are ahead of the
average set just brought in?
Solution: From the data of the problem, we have
$\lambda = 10/8 = 5/4$ sets per hour and $\mu = (1/30) \times 60 = 2$ sets per
hour.
(a) Expected idle time of the repairman each day
The number of hours for which the repairman remains busy in an
8-hour day (from the traffic intensity $\lambda/\mu$) is given by
$8(\lambda/\mu) = 8 \times (5/8) = 5$ hours.
Hence, the time for which the repairman remains idle in
an 8-hour day will be $8 - 5 = 3$ hours.
(b) Expected (or average) number of TV sets in the system
$$L_s = \frac{\lambda}{\mu - \lambda} = \frac{5/4}{2 - (5/4)} = \frac{5}{3} \approx 2 \text{ TV sets (approx.)}$$
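The arithmetic of Example 1 can be checked with a short Python sketch. The function name and the returned keys are illustrative; the formulas are the standard steady-state results for a single server with Poisson arrivals and exponential service, which the solution above uses.

```python
from fractions import Fraction


def mm1_measures(lam, mu):
    """Steady-state measures of a single-server queue with Poisson arrivals
    and exponential service times (requires lam < mu)."""
    if lam >= mu:
        raise ValueError("steady state requires lam < mu")
    rho = lam / mu                    # traffic intensity
    return {
        "rho": rho,
        "idle_fraction": 1 - rho,     # fraction of time the server is idle
        "L_s": lam / (mu - lam),      # expected number in the system
    }


# Example 1 data: 10 sets per 8-hour day, mean repair time 30 minutes.
m = mm1_measures(Fraction(10, 8), Fraction(2))
idle_hours = 8 * m["idle_fraction"]   # expected idle hours in an 8-hour day
print(idle_hours, m["L_s"])           # prints: 3 5/3
```

Exact fractions are used here so that the value 5/3 appears before it is rounded to about 2 sets.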
Example 2. In a railway marshalling yard, goods trains arrive
at a rate of 30 trains per day. Assume that the inter-arrival time
follows an exponential distribution and that the service time (the
time taken to hump a train) is also exponentially distributed with
an average of 36 minutes. Calculate:
(a) Expected queue size (line length)
(b) Probability that the queue size exceeds 10.
If the input of trains increases to an average of 33 per day,
what will be the change in (a) and (b)?
Solution: From the data of the problem, we have
$\lambda = 30/(60 \times 24) = 1/48$ trains per minute and $\mu = 1/36$ trains per
minute. Then the traffic intensity is $\rho = \lambda/\mu = 36/48 = 0.75$.
(a) Expected queue size (line length)
$$L_s = \frac{\rho}{1 - \rho} = \frac{0.75}{1 - 0.75} = 3 \text{ trains}$$
(b) Probability that the queue size exceeds 10
$$P(n \ge 10) = \rho^{10} = (0.75)^{10} \approx 0.06$$
Now, if the input increases to 33 trains per day, then we
have
$\lambda = 33/(60 \times 24) = 11/480$ trains per minute and $\mu = 1/36$
trains per minute.
Thus, the traffic intensity is $\rho = \lambda/\mu = (11/480)(36) \approx 0.83$.
Hence, recalculating the values for (a) and (b):
$$L_s = \frac{\rho}{1 - \rho} = \frac{0.83}{1 - 0.83} \approx 5 \text{ trains (approx.)}$$
$$P(n \ge 10) = \rho^{10} = (0.83)^{10} \approx 0.2 \text{ (approx.)}$$
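The sensitivity of the marshalling yard to the arrival rate can be reproduced with a small Python sketch. The function name and its parameters are illustrative assumptions; the formulas are those used in the solution above, applied without rounding the traffic intensity first.

```python
def mm1_summary(arrivals_per_day, mean_service_minutes, k=10):
    """Return (rho, L_s, P(n >= k)) for a single-server queue with Poisson
    arrivals and exponential service times."""
    lam = arrivals_per_day / (24 * 60)   # arrivals per minute
    mu = 1 / mean_service_minutes        # services per minute
    rho = lam / mu                       # traffic intensity
    if rho >= 1:
        raise ValueError("steady state requires rho < 1")
    return rho, rho / (1 - rho), rho ** k


for trains in (30, 33):
    rho, l_s, tail = mm1_summary(trains, 36)
    print(trains, round(rho, 3), round(l_s, 2), round(tail, 3))
```

With the unrounded intensity of 0.825, the second case gives a line length of about 4.7 trains and a tail probability of about 0.15, in line with the rounded figures in the solution. A 10 per cent rise in arrivals thus raises the expected line length by more than half.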
Example 3: Consider a single server queuing system with
Poisson input and exponential service times. Suppose the mean
arrival rate is 3 calling units per hour, the expected service
time is 0.25 hour and the maximum permissible number of calling
units in the system is two. Derive the steady state probability
distribution of the number of calling units in the system, and
then calculate the expected number in the system.
Solution: From the data of the problem, we have
$\lambda = 3$ units per hour, $\mu = 4$ units per hour and $N = 2$.
Then the traffic intensity is $\rho = \lambda/\mu = 3/4 = 0.75$.
The steady state probability distribution of the number $n$ of
customers (calling units) in the system is
$$P_n = \frac{(1 - \rho)\rho^n}{1 - \rho^{N+1}}, \quad \rho \ne 1, \quad n = 0, 1, \ldots, N$$
$$P_n = \frac{(1 - 0.75)(0.75)^n}{1 - (0.75)^{2+1}} = (0.43)(0.75)^n$$
and
$$P_0 = \frac{1 - \rho}{1 - \rho^{N+1}} = \frac{1 - 0.75}{1 - (0.75)^{2+1}} = \frac{0.25}{1 - (0.75)^3} \approx 0.43$$
The expected number of calling units in the system is given by
$$L_s = \sum_{n=1}^{N} n P_n = \sum_{n=1}^{2} n (0.43)(0.75)^n = 0.43 \sum_{n=1}^{2} n (0.75)^n = (0.43)\{0.75 + 2(0.75)^2\} \approx 0.81$$
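The finite-capacity distribution of Example 3 can be tabulated with a few lines of Python. This is a minimal sketch; the function name is illustrative, and the branch for a traffic intensity of exactly 1 (where all states are equally likely) is added only so that the function is complete.

```python
def mm1n_distribution(lam, mu, capacity):
    """Steady-state probabilities P_0..P_N for a single-server queue with
    Poisson input, exponential service and at most `capacity` units."""
    rho = lam / mu
    if rho == 1:
        # Limiting case: every state 0..N is equally likely.
        return [1 / (capacity + 1)] * (capacity + 1)
    p0 = (1 - rho) / (1 - rho ** (capacity + 1))
    return [p0 * rho ** n for n in range(capacity + 1)]


# Example 3 data: 3 arrivals per hour, 4 services per hour, N = 2.
probs = mm1n_distribution(3, 4, 2)
expected_in_system = sum(n * p for n, p in enumerate(probs))
print([round(p, 3) for p in probs], round(expected_in_system, 2))
```

The probabilities sum to 1, which is a quick check that the normalizing factor has been applied correctly.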
APPLICATION OF SIMULATION
Introduction: In spite of powerful tools like linear
programming, dynamic programming and other quantitative
techniques, not all the problems encountered can be solved by these
approaches. In simulation we work with a model of a system
and study its behavior under circumstances that are not
amenable to direct analytical solution. Thus the effects of alternative
policies can be studied without tampering with the actual
system. In a sense, simulation serves as a management's
laboratory for evaluating various alternatives.
a) Stages in Simulation: The first stage is to construct the
model, which represents the relevant characteristics of the
real world system. The second stage in simulation is
concerned with operation of the model. In general, the
operation phase involves the generation of synthetic output
data. From these data, we infer how the corresponding real
world system would function under similar circumstances.
b) The Monte Carlo Simulation: Monte Carlo simulation is
one of the simplest operations research tools. It is
patterned after the game of roulette, except that it uses
numbers which have an equal chance of being selected. The
details of the Monte Carlo technique have been covered
earlier in Chapter 12.3.
Specific Examples of Simulation in Industry:
1. Simulation of operation at large airports to test changes
in present policies and practices.
2. Traffic across junctions to determine best time sequence.
3. Maintenance operations to determine optimal size of
crews.
4. Simulation of a country's economy to predict the effect of
economic policy decisions.
5. Large scale inventory control problems.
6. To evaluate defensive and offensive weapons systems.
7. Communication systems to determine components
required for satisfactory service.
8. Overall business operations.
9. Usage of river systems to determine the best
configurations of dams, power plants and irrigation works.
10.Job shop simulation.
Example on Monte Carlo Simulation Technique
A firm has a single-channel service station with the following
arrival and service time probability distributions:
Interarrival Time                Service Time
(minutes)          Probability   (minutes)      Probability
10                 0.10          5              0.08
15                 0.25          10             0.14
20                 0.30          15             0.18
25                 0.25          20             0.24
30                 0.10          25             0.22
                                 30             0.14
The customer's arrival at the station is a random
phenomenon and the time between arrivals varies from
10 minutes to 30 minutes. The service time varies from 5 minutes
to 30 minutes. The queuing process begins at 10 a.m. and
proceeds for nearly 8 hours. An arrival goes to the service
facility immediately, if it is free. Otherwise it will wait in a
queue. The queue discipline is first come first served.
If the attendant's wages are Rs 10 per hour and the
customer's waiting time costs Rs. 12 per hour, then would it
be an economical proposition to engage a second attendant?
Answer using the Monte Carlo simulation technique.
Solution: The cumulative probability distributions and
random number intervals for both interarrival time and
service time are shown in Tables 1 and 2, respectively.
Table 1
Interarrival Time                Cumulative    Random Number
(minutes)          Probability   Probability   Interval
10                 0.10          0.10          00-09
15                 0.25          0.35          10-34
20                 0.30          0.65          35-64
25                 0.25          0.90          65-89
30                 0.10          1.00          90-99
Table 2
Service Time                     Cumulative    Random Number
(minutes)          Probability   Probability   Interval
5                  0.08          0.08          00-07
10                 0.14          0.22          08-21
15                 0.18          0.40          22-39
20                 0.24          0.64          40-63
25                 0.22          0.86          64-85
30                 0.14          1.00          86-99
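The way random number intervals follow from the probabilities can be sketched in Python. The function names are illustrative; the sketch assumes probabilities given to two decimal places, so that each value is allotted a whole number of the 100 two-digit random numbers 00 to 99.

```python
def random_number_intervals(distribution):
    """Return (value, low, high) triples of two-digit random number intervals
    for (value, probability) pairs whose probabilities sum to 1."""
    intervals = []
    low = 0
    for value, probability in distribution:
        width = round(probability * 100)   # share of the numbers 00-99
        intervals.append((value, low, low + width - 1))
        low += width
    if low != 100:
        raise ValueError("probabilities must sum to 1")
    return intervals


def lookup(intervals, random_number):
    """Return the value whose interval contains the two-digit random number."""
    for value, low, high in intervals:
        if low <= random_number <= high:
            return value
    raise ValueError("random number must lie between 0 and 99")


# Service time distribution from the example: (minutes, probability).
service = [(5, 0.08), (10, 0.14), (15, 0.18), (20, 0.24), (25, 0.22), (30, 0.14)]
intervals = random_number_intervals(service)
print(intervals[1], lookup(intervals, 26))   # prints: (10, 8, 21) 15
```

The check that the interval widths add up to 100 is a simple guard against a probability column that does not sum to 1.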
The simulation worksheet developed for the given problem is
shown in Table 3. All times are in minutes, measured from the
start of the queuing process.
Table 3
Arrival  Random  Arrival   Arrival  Service  Waiting  Random  Service  Exit  Time in
Number   Number  Interval  Time     Start    Time     Number  Time     Time  System
(1)      (2)     (3)       (4)      (5)      (6)      (7)     (8)      (9)   (10)=(6)+(8)
1        20      15        15       15       0        26      15       30    15
2        73      25        40       40       0        43      20       60    20
3        30      15        55       60       5        98      30       90    35
4        99      30        85       90       5        87      30       120   35
5        66      25        110      120      10       58      20       140   30
6        83      25        135      140      5        90      30       170   35
7        32      15        150      170      20       84      25       195   45
8        75      25        175      195      20       60      20       215   40
9        04      10        185      215      30       08      10       225   40
10       15      15        200      225      25       50      20       245   45
11       29      15        215      245      30       37      15       260   45
12       62      20        235      260      25       42      20       280   45
13       37      20        255      280      25       28      15       295   40
14       68      25        280      295      15       84      25       320   40
15       94      30        310      320      10       65      25       345   35
From the 15 samples, the total waiting time is 225 minutes and
the total time spent by the customers in the system is 545
minutes. We compute the average waiting time and the average
time spent in the system (waiting plus service) as follows:
Average waiting time = 225/15 = 15 minutes.
Average time in the system = 545/15 = 36.33 minutes.
Thus, the average cost of waiting and service is given by
Cost of waiting = 12 x (15/60) = Rs 3.00 per hour
Cost of service = 10 x (36.33/60) = Rs 6.05 per hour
Since the average cost of service per hour is more than the average
cost of waiting per hour, it would not be an
economical proposition to engage a second attendant.
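The worksheet can be regenerated with a short Python sketch. It uses the random numbers listed in the worksheet and the random number intervals of Tables 1 and 2; the function and variable names are illustrative. The sketch covers the single-attendant, first come first served case only.

```python
# (upper end of random number interval, minutes) from Tables 1 and 2.
ARRIVAL = [(9, 10), (34, 15), (64, 20), (89, 25), (99, 30)]
SERVICE = [(7, 5), (21, 10), (39, 15), (63, 20), (85, 25), (99, 30)]

# Random numbers as listed in the worksheet.
ARRIVAL_RN = [20, 73, 30, 99, 66, 83, 32, 75, 4, 15, 29, 62, 37, 68, 94]
SERVICE_RN = [26, 43, 98, 87, 58, 90, 84, 60, 8, 50, 37, 42, 28, 84, 65]


def lookup(table, random_number):
    """Map a two-digit random number to minutes using interval upper ends."""
    for upper, minutes in table:
        if random_number <= upper:
            return minutes
    raise ValueError("random number must lie between 0 and 99")


def simulate(arrival_rn, service_rn):
    """Single attendant, first come first served; times in minutes."""
    rows = []
    arrival = 0    # arrival time of the current customer
    free_at = 0    # time at which the attendant becomes free
    for a_rn, s_rn in zip(arrival_rn, service_rn):
        arrival += lookup(ARRIVAL, a_rn)
        start = max(arrival, free_at)   # wait if the attendant is busy
        service = lookup(SERVICE, s_rn)
        free_at = start + service
        rows.append({
            "arrival": arrival, "start": start, "wait": start - arrival,
            "service": service, "exit": free_at, "in_system": free_at - arrival,
        })
    return rows


rows = simulate(ARRIVAL_RN, SERVICE_RN)
total_wait = sum(r["wait"] for r in rows)
total_in_system = sum(r["in_system"] for r in rows)
print(total_wait / len(rows), round(total_in_system / len(rows), 2))
```

Replacing the two lists of random numbers with freshly drawn values gives a different sample of 15 arrivals; a longer run would give more stable averages.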
SIMULATION OF MAINTENANCE PROBLEMS
Example 2: A plant has a large number of similar machines.
The machine breakdowns or failures are random and
independent. The shift in-charge of the plant collected data,
on an hourly basis, about the time between machine breakdowns
and the repair time required. The record for the past 100
observations is shown below:
Time Between Recorded Machine                 Repair Time
Breakdowns (hours)              Probability   Required (hours)   Probability
0.5                             0.05          1                  0.28
1                               0.06          2                  0.52
1.5                             0.16          3                  0.20
2                               0.33
2.5                             0.21
3                               0.19
For each hour that one machine is down, whether being repaired or
waiting to be repaired, the plant loses Rs. 70 by way of lost production.
A repairman is paid Rs. 20 per hour.
(a) Simulate this maintenance system for 15 breakdowns.
(b) How many repairmen