ITM University Online
www.itmuniversityonline.org

Research Methodology

Table of Contents

1. Introduction to Research Methodology
1.1 Introduction
1.2 Definition of Research
1.3 Research Methods and Research Methodology
1.4 Objectives of Research
1.5 Motivation for Conducting Research
1.6 Criteria of Good Research
1.7 Characteristics of Research
1.8 Types of Research
1.9 Steps Involved in Research Process
1.10 Role of Research in Business
1.11 Chapter Summary

2. Research Problem Formulation and Research Design
2.1 Introduction
2.2 Definition of Research Problem
2.3 Procedure of Defining General Research Problem
2.4 Objectives of Research Design
2.5 Research Design
2.6 Contents of Research Design
2.7 Important Concepts in Research Design
2.8 Types of Research Design
2.9 Basic Principles of Experimental Design
2.10 Chapter Summary

3. Sampling Design and Sampling Techniques
3.1 Introduction
3.2 Population, Census, and Sample
3.3 Sampling Design
3.4 Sample Design Procedure
3.5 Characteristics of a Good Sample Design
3.6 Criteria for Selecting a Sampling Procedure
3.7 Types of Sampling Techniques
3.8 Chapter Summary

4. Methods and Tools of Data Collection
4.1 Introduction
4.2 Data Types
4.3 Questionnaire Design
4.4 Requirements of a Good Questionnaire
4.5 Case Study
4.6 Chapter Summary

5. Measurement and Scaling Techniques
5.1 Introduction
5.2 Measurement and Scaling
5.3 Primary Scales of Measurement
5.4 Classification of Scaling Techniques
5.5 Comparative Scales
5.6 Categorical Scales
5.7 Chapter Summary

6. Tabulation and Analysis of Data
6.1 Introduction
6.2 Tabulation
6.3 Multiple Regression Analysis
6.4 Multiple Discriminant Analysis
6.5 Measures of Central Tendency
6.6 Measures of Dispersion
6.7 Measures of Skewness
6.8 Measures of Relationships
6.9 Association of Attributes
6.10 Time Series Analysis and Index Number
6.11 Chapter Summary

7. Hypothesis Testing
7.1 Introduction
7.2 Hypothesis
7.3 Types of Hypothesis
7.4 Terminologies Used in Hypothesis Testing
7.5 Procedure of Testing of Hypothesis
7.6 Parametric and Non-parametric Testing
7.7 Testing of Hypothesis for Mean
7.8 Testing of Hypothesis for Variance
7.9 Testing of Hypothesis for Correlation Coefficients
7.10 Limitations of Testing of Hypothesis
7.11 Chapter Summary

8. Analysis of Variance (ANOVA)
8.1 Introduction
8.2 Analysis of Variance (ANOVA)
8.3 Why Analyze Variance?
8.4 Variability Measure by One-way ANOVA
8.5 ANOVA Technique
8.6 One-way ANOVA - Example
8.7 Two-way ANOVA
8.8 Analysis of Covariance (ANOCOVA)
8.9 Chapter Summary

9. Non-parametric Testing and Chi-Square Test
9.1 Introduction
9.2 Non-parametric Test
9.3 Chi-square Test
9.4 Sign Test
9.5 Run Test
9.6 Spearman's Rank Correlation
9.7 Kendall's Test
9.8 Wilcoxon Matched-pairs Test
9.9 Mann-Whitney U Test
9.10 Chapter Summary

10. Research Report Writing
10.1 Introduction
10.2 Meaning and Importance of Research Report
10.3 Steps in Writing Research Report
10.4 Report Format
10.5 Preliminary Parts of Research Report
10.6 Main Body of Research Report
10.7 Types of Research Report
10.8 Chapter Summary

1. Introduction to Research Methodology

1.1 Introduction

The term research is increasingly used for any kind of exploration intended to discover
new or interesting facts. It describes a variety of investigations across a wide range of
subjects, such as leisure studies and sports, hospitality, healthcare and nursing studies,
the natural sciences, social sciences, the environment, social anthropology, psychology,
politics, business, education, and the humanities. Many university courses include
research that students must carry out independently, in the form of projects,
dissertations, and theses, and the more advanced the degree, the greater the research
content.

After reading this chapter, you will be able to:

• Define research
• Explain research methods and research methodology
• Explain the objectives of research
• List motivations for conducting research
• Enumerate the characteristics of research
• Identify the different types of research
• Describe steps involved in research process
• Explain the role of research in business

1.2 Definition of Research

Generally, research refers to a search for knowledge. It can also be defined as a
scientific and systematic search for pertinent information on a specific topic. Research
is one way to find a sound solution to a problem, by investigating and analyzing
information in a scientific way. Research is an academic activity, and the term should
therefore be used in a technical sense. Some people describe research as a movement
from the known to the unknown.

According to Clifford Woody, "Research comprises defining and redefining

problems, formulating hypothesis or suggested solutions; collecting, organising

and evaluating data; making deductions and reaching conclusions; and at last

carefully testing the conclusions to determine whether they fit the formulating

hypothesis."

In the words of Grinnell, "Research is a structured inquiry that utilizes

acceptable scientific methodology to solve problems and creates new

knowledge that is generally acceptable."

D. Slesinger and M. Stephenson, in the Encyclopaedia of Social Sciences, defined
research as, "the manipulation of things, concepts or symbols for the purpose of
generalising to extend, correct or verify knowledge, whether that knowledge
aids in construction of theory or in the practice of an art."

In other words, research is the search for knowledge through an objective and systematic
method of finding a solution to a problem. Research is a systematic approach concerned
with generalization and the formulation of a theory. It includes enunciating the problem,
formulating a hypothesis, collecting the facts or data, analyzing the facts, and reaching
certain conclusions.

1.3 Research Methods and Research Methodology

Research Methods

Research methods and research methodology differ. Methods/techniques that are used

during the course of conducting the research are known as research methods. The study

of research methods offers training to apply various methods to solve the research

problem.

Research methods can be categorized into the following three groups:

• Methods concerned with the collection of data. Examples: the questionnaire method,
  interviews, etc.
• Statistical techniques used for establishing relationships between the variables under
  study. Examples: regression and correlation analysis, etc.
• Methods used for evaluating the accuracy of the results. Example: testing of
  hypotheses, etc.

Research Methodology

Research methodology is a way to systematically solve the research problem. In a
research study, a researcher generally adopts various steps. Methodology refers to the
procedure, theory, or study of methods by which knowledge is gained. Research
methodology is the procedure by which researchers go about their work of describing,
explaining, and predicting phenomena.

Research methodology gives the necessary training in choosing the scientific tools,
materials, methods, and techniques relevant to the chosen problem. Research methodology
is also concerned with explaining questions such as:

• Why is the particular research study undertaken?
• How is the research problem formulated?
• Why is the particular technique of data analysis used?

1.4 Objectives of Research

Research discovers answers to questions through scientific procedures. Its main aim is
to find out the truth that is hidden and has not yet been discovered. Each research study
has its own specific purpose.

The major objectives of research are:

• Exploration: To gain familiarity with a phenomenon or to achieve new insights into it.
• Description: To portray accurately the characteristics of a particular individual,
  situation, or group.
• Diagnosis: To determine the frequency with which something occurs or with which it is
  associated with something else.
• Explanation: To test a hypothesis of a causal relationship between variables.

1.5 Motivation for Conducting Research

Various factors motivate people to undertake research studies:

1. The desire to earn a research degree, along with its consequential benefits.

2. The desire to face the challenge of solving unsolved problems.

3. The desire to experience the intellectual joy of doing creative work.

4. The desire to serve society.

5. The desire to earn respect and recognition.

Note: This is not a comprehensive list of factors motivating people to undertake research
studies.

1.6 Criteria of Good Research

The research has to satisfy the following criteria:

1. The purpose of the research should be clearly defined.

2. The research procedure used should be described in sufficient detail, in order to

permit another researcher to repeat the research for further advancement,

keeping the continuity of what has already been attained.

3. The procedural design of the research should be carefully planned to yield results

that are as objective as possible.

4. The researcher should report with complete frankness the flaws in procedural
design and estimate their effects on the findings.

5. The analysis of data should be sufficiently adequate to reveal its significance and
the methods of analysis used should be appropriate. The validity and reliability of
the data should be checked carefully.

6. Conclusions should be confined to those justified by the data of the research and

limited to those for which the data provide an adequate basis.

7. Greater confidence in research is warranted if the researcher is experienced, has a

good reputation in research, and is a person of integrity.

Source: 1. James Harold Fox, Criteria of Good Research, Phi Delta Kappan, Vol. 39 (March, 1958), pp.
285-86. 2. Danny N. Bellenger and Barnett A. Greenberg, "Marketing Research-A Management
Information Approach", pp. 107-108

1.7 Characteristics of Research

Various criteria are used to check the validity and fairness of research, and the success
of any research depends on them. Some characteristics of research are:

Reliability

Reliability is a subjective notion that cannot be measured precisely, although various
techniques and instruments are used to estimate it. Reliable research yields similar
results every time it is undertaken with a similar population and similar procedures.
Reliability thus refers to the repeatability of any research, research instrument, tool,
or procedure. The reliability of research is proportional to the number of similar
results produced.

Validity

Validity refers to how well founded the research conclusions, assumptions, or
propositions are, that is, whether they are true or false. The applicability of any
research depends on its validity. The validity of a research instrument can be defined as
its suitability to the research problem, or how accurately the instrument measures the
problem. Defining concepts in the best possible manner keeps the research on track and
reduces errors during measurement.

Accuracy

Accuracy refers to the degree to which each research process, instrument, and tool is

related to each other. It measures whether research tools have been selected in the best

possible manner and research procedures suit the research problem or not. The

accuracy of research can be improved by choosing the best data collection tool.

Credibility

Credibility comes from using the best sources of information and the best procedures in
research. Research that relies on secondary data may be completed in less time, but its
credibility is at stake, because secondary data has already been handled by other people
and is therefore less dependable. A certain percentage of secondary data can be used when
primary data is not available, but relying entirely on secondary data when primary data
could be collected is the least credible approach. The credibility of research can also
be increased by giving accurate references.

Generalizability

This refers to the applicability of research findings to a larger population. A researcher
takes a small sample from the target population to conduct the research. Because the
sample is meant to be representative of the population, the findings should hold for the
population as well. If the research findings can be applied to any sample from the
population, the results of the research are said to be generalizable.
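
The idea of drawing a small sample and checking it against the population can be
sketched in a few lines of Python. This is only an illustrative sketch: the population
values, the sample size, the seed, and the function name draw_simple_random_sample are
all hypothetical and are not taken from this eBook.

```python
import random
import statistics


def draw_simple_random_sample(population, sample_size, seed=0):
    """Draw a simple random sample without replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return rng.sample(population, sample_size)


# Hypothetical population: monthly spend (arbitrary units) of 1,000 customers.
population = [50 + (i * 37) % 101 for i in range(1000)]

# A small sample taken from the target population.
sample = draw_simple_random_sample(population, sample_size=100)

population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

# If the sample is representative, the two means should be close.
print(f"population mean = {population_mean:.2f}")
print(f"sample mean     = {sample_mean:.2f}")
```

In real research the population mean is unknown, which is exactly why the quality of the
sample matters. Sampling designs are discussed in Chapter 3.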

Empirical

Empirical research is based on real-life experience or observation and has been tested
for accuracy. Quantitative research is easier to verify scientifically than qualitative
research.

Systematic

According to this approach, no research can be conducted haphazardly. Each step must
follow the other. There is a set of procedures that has been tested over a period of time
and is thus suitable for use in research.

Controlled

When an event is studied in research, many factors may affect it. Some of these factors
are held constant as controlled factors, while others are tested for their possible
effect. The controlled factors or variables must be controlled rigorously. In the pure
sciences, it is relatively easy to control such factors because experiments are conducted
in the laboratory, but in the social sciences it is difficult to control them because of
the nature of the research.

Source: 1. http://gulnazahmad.hubpages.com/hub/research-methodology 2. James Harold Fox, Criteria of
Good Research, Phi Delta Kappan, Vol. 39 (March, 1958), pp. 285-86. 3. Danny N. Bellenger and Barnett
A. Greenberg, "Marketing Research-A Management Information Approach", pp. 107-108

1.8 Types of Research

The basic types of research are:

Descriptive vs. Analytical

Descriptive research includes surveys and fact-finding enquiries of different kinds. It
provides an accurate portrayal of the characteristics of a particular individual,
situation, or group, and is also known as statistical research. It deals with anything
that can be counted and studied and that has an impact on the lives of the people
concerned, for example customer frequency, the preferences of students, or similar data.
In analytical research, by contrast, the researcher analyzes facts or information already
available to make a critical evaluation of the material.

Applied vs. Fundamental

Applied research is scientific study aimed at solving practical problems rather than
acquiring knowledge for its own sake. It is used to find solutions to everyday problems,
cure illnesses, and develop innovative techniques, for example to increase agricultural
crop production or to improve the energy efficiency of hospitals, machinery, and so on.

Research concerned with generalizations and the formulation of a theory is called
fundamental research. In other words, gathering knowledge for the sake of knowledge is
termed 'pure', 'basic', or 'fundamental' research. It includes research on natural
phenomena and on topics in pure mathematics. Studies of human behavior carried out with a
view to making generalizations about that behavior, and basic science that probes for
answers to questions such as 'How did the universe begin?', also come under fundamental
research.

Quantitative vs. Qualitative

Quantitative research is based on scientific methods in which the data relates to the
measurement of quantity or amount. Quantitative research should be conducted when the
phenomenon can be expressed in terms of quantity, and qualitative research when the
phenomenon is qualitative in nature. The objective of quantitative research is to develop
and employ mathematical theories, models, and/or hypotheses pertaining to the phenomenon.
For example, research related to the development of machines and tools for measurement.

Qualitative research aims to collect detailed information about human attitudes and the
reasons that govern them. Research designed to find out how people feel or what they
think about a particular subject or institution is also qualitative research.

Conceptual vs. Empirical

Conceptual research relates to abstract ideas or theory. Philosophers and thinkers use
this type of research to develop new concepts or reinterpret existing ones. Empirical
research is data-based research that arrives at conclusions capable of being verified by
observation or experiment. It is also known as the experimental type of research.

Experimental research, that is, an objective, systematic, and controlled investigation,
is required for predicting and controlling phenomena and for examining probability and
causality among selected variables.

Source: Pauline V. Young, Scientific Social Surveys and Research, p. 30.

Other Types of Research

The objective of exploratory research is to analyze the data and explore the possibility
of obtaining as many relationships as possible between variables, without knowing their
end applications. It provides a basis for general findings and a better understanding of
the situation, and it typically relies on survey and observation methods. For example,
finding the various causes of a decrease in revenue in a particular car segment.

The objective of co-relational or causal research is to discover or establish the existence

of relationship/association/interdependence between two or more aspects of situations

or variables. For example, finding the impact of incentives on the productivity of the

workers, keeping other elements unchanged.
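
The strength of an association between two variables is commonly summarized with the
Pearson correlation coefficient. The sketch below is a minimal illustration in Python,
assuming hypothetical figures for incentive levels and worker productivity. The data and
the function name pearson_r are invented for this example and do not come from this eBook.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical data: incentive paid (% of salary) and units produced per day.
incentive = [0, 5, 10, 15, 20]
units_per_day = [40, 44, 47, 53, 56]

r = pearson_r(incentive, units_per_day)
print(f"r = {r:.3f}")  # a value close to +1 indicates a strong positive association
```

A high coefficient shows association only. Establishing that incentives cause the change
in productivity additionally requires other factors to be held constant, as the example
in the text notes.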

Significance of Research

According to Hudson Maxim, "All progress is born of inquiry. Doubt is often
better than overconfidence, for it leads to inquiry, and inquiry leads to
invention." Increased amounts of research make progress possible. Research inculcates
scientific and inductive thinking and promotes the development of logical habits of
thinking and organization.

The role of research in fields such as applied economics, business, and medicine is
increasing day by day. Because business is increasingly complex, research is used to a
large extent to solve complicated operational problems.

In the economic system, research provides a basis for nearly all government policies. A
large part of a government's budget is based on an analysis of the needs and desires of
the people and on the availability of revenues to meet these needs.

Research is of great importance in solving the planning and operational problems of
industries and businesses. Market research and operations research, along with
motivational research, are considered crucial in taking business decisions.

Market research is the investigation of the structure and development of a market for the
purpose of formulating efficient policies for production, sales, and purchasing.
Operations research is the application of mathematical, logical, and analytical methods
to business problems of cost minimization or profit maximization.

The significance of research can also be explained with the following points:

• To students who have to write a master's or Ph.D. thesis, research may mean
  careerism or a way to attain a higher position in the social structure.
• To professionals in research methodology, research may mean a source of livelihood.
• To philosophers and thinkers, research may mean an outlet for new ideas and insights.
• To literary men and women, research may mean the development of new styles and
  creative work.
• To analysts and intellectuals, research may mean the generalization of new theories.

Thus, research is the fountain of knowledge for the sake of knowledge and an important

source of providing guidelines for solving different business, governmental, and social

problems. It is a sort of formal training that enables one to understand the new

developments in one's field in a better way.

Source: C. R. Kothari, Research Methodology: Methods and Techniques, New Age International Publishers,
2nd Edition

Fig. 1.8a: Types of Research
Application perspective: Pure Research, Applied Research
Objective perspective: Descriptive Research, Explanatory Research, Exploratory Research, Co-relational Research
Inquiry mode perspective: Quantitative Research (structured approach), Qualitative Research (unstructured approach)

1.9 Steps Involved in Research Process

The research process can be summarized in the following eight steps:

Fig. 1.9a: Steps Involved in Research Process
(1) Research Problem Formulation, (2) Literature Review, (3) Hypothesis Formulation,
(4) Research Design (Sample Design), (5) Data Collection, (6) Data Analysis and Hypothesis
Testing, (7) Generalization and Interpretation, (8) Report Preparation

Step 1: Research Problem Formulation

There are two types of research problems: those that relate to states of nature and those
that relate to relationships between variables. Initially, the problem may be stated in a
broad, general way, and any doubts or ambiguities relating to it are then resolved. The
feasibility of a particular solution should be considered before a working formulation of
the problem is set up.

Formulating the research problem therefore involves two steps: understanding the problem
thoroughly, and rephrasing it into meaningful terms from an analytical point of view. The
broad, general statement is reframed into analytical or operational terms by rephrasing
the problem as specifically as possible.

Step 2: Literature Review

A literature review or survey is an examination of research publications, books, and
other supporting documents related to the defined problem. It helps the researcher
understand the chosen problem properly and acquire the theoretical and practical
knowledge needed to investigate it. It also helps in identifying the variables to be
considered and in assessing the current status of the problem. Once the research problem
is formulated, a brief summary of it should be written down. For example, a research
worker writing a thesis for a Ph.D. degree must write a synopsis of the topic and submit
it to the Committee or Research Board for approval.

An extensive survey of the literature concerned with the research problem is very
important. Abstracting and indexing journals and published or unpublished bibliographies
make the task simpler. Conference proceedings, government reports, academic journals,
books, and similar sources also help a great deal, and one source will often lead to
another. At this stage, a good library is of great help to the researcher.

Step 3: Hypothesis Formulation

Hypothesis formulation is the next step after the literature survey. A hypothesis is an
assumption made in order to draw a conclusion about the population under consideration,
and it should be stated in clear and unambiguous terms. Hypothesis formulation is an
important step because it provides the focal point for research. It is also crucial for
the analysis of data, because it indirectly determines the quality of data required for
the analysis. The hypothesis must be very specific and limited in scope, because it has
to be tested in the analysis stage.

Hypotheses are specific predictions about the nature and direction of the relationship
between two variables. In research, a working hypothesis is a tentative assumption made
in order to draw out and test its logical or empirical consequences. A hypothesis is a
result of the researcher's creativity.

Step 4: Research Design

Research design consists of the sample design and the methods for the collection,
measurement, and analysis of data. The research design must contain details of the
population and the sample, including their type, size, and probability distribution, as
well as the definition and details of the variables under study. It also contains the
procedures and techniques for data collection and the techniques used to process and
analyze the data.

Step 5: Data Collection

While selecting a method for collecting primary data, take into consideration the nature
of the investigation, the scope and objective of the inquiry, the available time and
financial resources, and the desired degree of accuracy. Primary data can be collected by
observation, through personal or telephone interviews, and through questionnaires,
schedules, or other data collection forms.

Step 6: Data Analysis and Hypothesis Testing

After data collection, the data is coded, tabulated, and analyzed for statistical
inference. Through coding, the data is categorized and transformed into symbols that can
be tabulated and counted.
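
Coding and tabulation can be illustrated with a short Python sketch. The codebook, the
survey answers, and the function names code_responses and tabulate below are hypothetical
and serve only to show the idea: each verbal answer is mapped to a numeric symbol, and the
symbols are then counted.

```python
from collections import Counter

# Hypothetical codebook mapping each answer category to a numeric code.
CODEBOOK = {
    "strongly disagree": 1,
    "disagree": 2,
    "neutral": 3,
    "agree": 4,
    "strongly agree": 5,
}


def code_responses(responses, codebook):
    """Transform raw text answers into numeric codes (coding)."""
    return [codebook[answer.strip().lower()] for answer in responses]


def tabulate(codes):
    """Count how often each code occurs, ordered by code (tabulation)."""
    return dict(sorted(Counter(codes).items()))


# Hypothetical raw answers as they might appear on returned questionnaires.
raw_answers = ["Agree", "neutral", "Strongly agree", "agree ", "Disagree", "Agree"]

codes = code_responses(raw_answers, CODEBOOK)
frequency_table = tabulate(codes)

print(codes)            # [4, 3, 5, 4, 2, 4]
print(frequency_table)  # {2: 1, 3: 1, 4: 3, 5: 1}
```

Normalizing the answers (trimming spaces, ignoring case) before coding is a design choice
made here so that inconsistently typed answers fall into the same category.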

Statistical values are computed from this data, and the hypothesis is tested by applying
tests such as the chi-square test, ANOVA, the F-test, and many more. According to the
testing criteria, the decision is then taken to accept or reject the hypothesis.
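
As a rough illustration of this step, the following Python sketch computes a chi-square
statistic for a 2 x 2 table and compares it with the critical value. The observed counts
and the function name chi_square_statistic are hypothetical. The critical value 3.841 is
the standard chi-square table value for 1 degree of freedom at the 5% significance level.

```python
def chi_square_statistic(observed):
    """Chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(column) for column in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, observed_count in enumerate(row):
            # Expected count if the row and column attributes were independent.
            expected = row_totals[i] * column_totals[j] / grand_total
            statistic += (observed_count - expected) ** 2 / expected
    return statistic


# Critical value for 1 degree of freedom at the 5% level of significance.
CRITICAL_VALUE = 3.841

# Hypothetical counts: rows = incentive offered (yes / no),
# columns = production target met (yes / no).
observed = [[30, 20],
            [15, 35]]

statistic = chi_square_statistic(observed)
reject_null = statistic > CRITICAL_VALUE

print(f"chi-square = {statistic:.3f}")
print("reject the null hypothesis" if reject_null else "do not reject the null hypothesis")
```

In this invented example the statistic is about 9.09, which exceeds 3.841, so the null
hypothesis of independence would be rejected. The chi-square test is treated in detail in
Chapter 9.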

Step 7: Generalization and Interpretation

Generalization involves two processes: theoretical inference from the data in order to
develop concepts and theory, and empirical application of the findings to a wider
population. In other words, it builds theory on the basis of research outputs.
Interpretation refers to the task of drawing inferences from the collected facts after an
analytical and/or experimental study, both to reach conclusions and to guide further
research.

Step 8: Report Preparation

Report writing is a vital step in research, in which the complete research and its
findings are compiled. A proper and valid report increases the effectiveness of the
research, and its acceptance and applicability depend on correct report writing.
Misleading conclusions can cause the whole research to be questioned, whereas valid
interpretations can expose the processes and relations that underlie its findings.

1.10 Role of Research in Business

The major role of research in business is to reduce the risk of business decisions by
providing appropriate information about customers, competitors, market trends, employees,
government regulations, and so on.

Through the research process, an organization obtains information about key business
areas, analyzes it, develops strategies, and distributes business information.

Research has three major strategic roles in business decision making:

• Expanding existing businesses, which includes:
  o Improvements in the existing product or service.
  o Changes in materials and technology for manufacturing, and many more.
• Exploring new business opportunities, which includes:
  o Exploring new technology or products for a new market.
  o Entering a new market with existing or new products.
• Broadening and deepening technological capabilities, which includes offering the same
  product with improved or new technology.

1.11 Chapter Summary

• Research can be summarized as a systematized effort to gain new knowledge.
• Research methods include the various procedures and techniques used for obtaining and
  analyzing data.
• Research methodology is the approach toward systematically solving research problems.
• Depending on the perspective, there are various research types:
  o Application viewpoint: Pure and Applied Research
  o Objective viewpoint: Descriptive, Exploratory, Explanatory, and Co-relational Research
  o Inquiry mode viewpoint: Quantitative and Qualitative Research
• The research process gives a detailed flow of steps to be followed for any kind of
  research.
• For business, the major purpose of research is to reduce the risk of decision-making.

2. Research Problem Formulation and Research Design

2.1 Introduction

Selecting and properly defining a research problem is the foremost step in the research
process. Problem identification and formulation are important stages in research, and
identifying the problem helps to define it correctly. A research problem is some
difficulty that a researcher experiences, in the context of either a theoretical or a
practical situation, and for which the researcher wants to obtain a solution.

In the words of Alison Loat, "A problem well-defined is a problem half-solved."

Without being clear about what you are going to research, it is difficult to plan how you
are going to research it. Identifying the research problem clearly enables you to define
your research strategy and data collection methods. The procedure of research problem
formulation will guide you toward accurate identification and formulation of the research
problem.

After defining the problem, you have to form a blueprint of how to conduct the research,
just as architects in the field of construction use a blueprint to decide on the
efficient allocation of various resources. The blueprint for conducting research is the
research design. It gives a detailed logical flow of the research approach and the
respective methods of conducting the research.

After reading this chapter, you will be able to:

Define research problem

Explain the procedure of defining general research problem

Define the objectives of research problem

Define research design

Explain contents of research design

Explain important concepts in research design

Explain types of research design

Explain basic principles of experimental design


2.2 Definition of Research Problem

In the words of Zikmund and Babin, "A problem is a situation, occurs when there

is a difference between the current conditions and most preferable set of

conditions."

Also, according to C. R. Kothari, "A research problem, in general, refers to some

difficulty which a researcher experiences in the context of either a theoretical

or practical situation and wants to obtain a solution for the same."

The process of defining and developing a decision statement and the steps involved in

translating it into more precise research terminology, including a set of research

objectives, can be referred to as 'research problem definition'.

A correctly defined research problem guides the literature survey and the selection of the

research strategy, research design, data collection method, and analysis method. Ill-defined

research problems may create hurdles, whereas a proper definition of the research

problem keeps the researcher on track. Thus, defining a research problem

properly is a requirement for any study and is a highly important step. The formulation

of a problem is often more important than its solution. It is only through careful detailing of the

research problem that you can work out the research design and smoothly carry out

all the subsequent steps involved in the research.

Factors to be considered while formulating a research problem:

An individual or an organization under consideration.

The environment or condition to which the difficulty pertains.

Specific objectives or goals to be attained.

Economic consideration: Research design efforts cost money. The value of

anticipated results must be commensurate with the efforts put into the research, in

terms of benefit/returns.

Technical consideration: Adequate technical knowledge must be available to carry out

the research.

Environmental consideration: Controversial subjects should not be chosen for

research. Preliminary studies or pilot surveys should be conducted after research

problem definition.

Consideration of limitations: Limitations, such as time limit, resource constraints,

and policy constraints are to be considered.


2.3 Procedure of Defining General Research Problem

The technique of defining the general research problem involves the following steps:

1. Defining the Problem in a General Way

In a research problem, you must address either a specific practical operational

issue or some scientific discovery. Keeping in view some practical concern or some

scientific or intellectual interest, all the problems should be stated in a broad,

general way. Hence, to formulate a problem, researchers must immerse themselves

thoroughly in the subject matter.

In social research, it is advisable to do some field observation and undertake a pilot

survey. The researcher can then seek the guidance of the guide or the subject

expert in accomplishing this task. The guide puts forth the problem in general

terms, and it is then up to the researcher to narrow it down and phrase the

problem in operational terms. The problem stated in a general way may contain

ambiguities. Such ambiguities must be resolved by cool thinking and rethinking.

2. Understanding the Nature of the Problem

Discuss the problem to understand it in a better way. Researchers should consider

all the points that induced them to make a general statement concerning the

problem. They can enter into discussion with those who have good knowledge of

the problem concerned or similar other problems, for a better understanding of the

nature of the problem involved.

3. Literature Survey

Review all the available literature on the research area, with the aim of adding a

new dimension to the particular area that leads to the enhancement of knowledge.

Before a definition of the research problem is given, all available literature

concerning the problem at hand must be surveyed and examined.

The researcher must be well-conversant with relevant theories in the field, as well as

with reports and records. Studies on related problems are useful for indicating the

type of difficulties that may be encountered in the current study, as well as the

possible analytical shortcomings. At times, such studies may also suggest

beneficial and even new lines of approach to the present problem.


4. Developing Ideas through Discussions

Take the advice of experienced researchers, to develop different aspects of the

research problem. Researchers can discuss the problem with colleagues and others

who have enough experience in the same area or in working on similar problems.

A discussion of the problem concerned often leads to the generation of useful

information. Discussions can develop new ideas; people with rich experiences are

in a position to enlighten the researcher on different aspects of the proposed

study.

5. Rephrasing the Research Problems

Put the research problem in 'as specific terms as possible', so that it may become

operationally viable for hypothesis development. Researchers can frame the

research problem more appropriately, to get good results.

Business Research Problem Definition Process

In case of business research problem definition, the process can be shown as illustrated

in Fig. 2.3a.

1. Understand the situation - Identify the key symptoms

2. Identify the problems from the symptoms

3. Write managerial decision statement and corresponding research objectives

4. Determine the unit of analysis

5. Determine the relevant variables

6. Write research questions and/or research hypotheses

Fig. 2.3a: Business Research Problem Definition Process

Source: Zikmund, Babin, Carr, Griffin, Business Research Methods, 8th Edition


1. Understand the situation: Identify the key symptoms

Identify the symptoms: ask 'what has changed?'

The symptoms can be: decline in sales, increase in the cost of recruitment,

etc.

2. Identify the problems from the symptoms

Relate the symptoms with various possible reasons or causes. For example, a firm

has a problem with advertising effectiveness; the causes can be low brand

awareness, using wrong media, etc.

3. Write research objectives corresponding to managerial decision

statement

Decision statement explains how a problem can be solved.

It explains the information that is needed to help make the decision.

Research objective expresses potential research results that should aid

decision making.

4. Determine the unit of analysis

The unit of analysis for a study indicates what or who should provide the

data and at what level of aggregation. For example, it can be the target

population from whom data needs to be collected to serve the research

objectives.

In a study of home appliances, the data is gathered from married couples.

5. Determine the relevant variables

A variable is anything that varies or changes from one instance to another.

It can exhibit differences in value or direction.

Determining which variables should be studied to address the decision

statement is essential in any research.

6. Write research questions and/or research hypotheses

Research questions simply restate research objectives in the form of a question.


2.4 Objectives of Research Problem

The purpose of research is to discover answers to questions through the application of scientific methods.

According to C. R. Kothari, research objectives can be divided into the following broad

groups:

To gain familiarity with a phenomenon or to achieve new insights into it, known as

exploratory or formulative research studies.

To portray, accurately, the characteristics of a particular individual, situation or a

group, known as descriptive research studies.

To determine the frequency with which something occurs or with which it is

associated, known as diagnostic research studies.

To test the hypothesis of a causal relationship between variables, known as

hypothesis-testing research studies.

Formulating Hypothesis

A hypothesis states a relationship between two or more variables that suggests

an answer to the research question.

It predicts the relationship, in terms of expected results or outcomes of a study.

The direction of the relationship between dependent and independent variables is

also predicted.

The suggestion formulated in the hypothesis may be the solution to the problem.

Two types of hypothesis are:

Null Hypothesis (H0)

Null hypothesis predicts that, in the general population, no relationship or no significant

difference exists between the groups or variables under study.

Alternative Hypothesis (HA)

Alternative hypothesis is just the opposite of null hypothesis; it states that there is a

significant difference or relationship between the groups or variables that can be tested.

For example, in planning an advertising strategy, a firm is interested in evaluating the

effectiveness of promoting a product by internet and television.

H0: There is no significant difference between the effectiveness of promotion by internet

and television.

HA: There is a significant difference between the two advertising media, in terms of effectiveness.

Or

HA: Promotion by internet is more effective than promotion by television.
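As a minimal sketch of how hypotheses like these could be tested, the Python snippet below runs a two-sided permutation test on the difference in mean effectiveness between the two media. The effectiveness scores, the function name `permutation_test`, and the 5% significance level are illustrative assumptions and do not come from this chapter; a permutation test is only one of several procedures that could be used.

```python
import random
from statistics import mean

def permutation_test(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns the approximate p-value under H0 (no difference between groups).
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Under H0 the group labels are interchangeable, so reshuffle them
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    return extreme / n_permutations

# Hypothetical effectiveness scores (e.g. purchase-intent ratings out of 10)
internet = [7.1, 6.8, 7.5, 8.0, 7.2, 6.9, 7.7, 7.4]
television = [6.0, 6.3, 5.8, 6.5, 6.1, 5.9, 6.4, 6.2]

p_value = permutation_test(internet, television)
alpha = 0.05  # assumed significance level
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"p-value = {p_value:.4f}: {decision}")
```

A small p-value indicates that a difference as large as the observed one would rarely arise if H0 were true, so H0 is rejected in favour of the non-directional HA. Testing the directional HA would require a one-sided version of the same comparison.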


2.5 Research Design

In the words of Claire Selltiz, "A research design is the arrangement of

conditions for collection and analysis of data in a manner that aims to combine

relevance to the research purpose with economy in procedure."

A research design is a framework or a blueprint for conducting research. It specifies the

procedures for obtaining the information needed to structure or solve the

research problems. It is a decision matrix that addresses the what,

where, when, which, and how of a research procedure.

Research design facilitates the logical flow of research operations. It gives the concise

plan of logical relations between the research type, data required, data collection and

analyzing method, and research reporting method. It helps to yield maximum

information, with minimal expenditure of effort, time, and money.

2.6 Contents of Research Design

The contents of research design are:

Sampling Design

It deals with the method of selecting items to be observed for the given research type.

Observational Design

It describes the conditions under which the observations are to be made.

Statistical Design

It is concerned with the question of how many items are to be

observed and how the information and data gathered are to be analyzed.

Operational Design

It deals with the techniques by which the procedures specified in the

sampling, observational, and statistical designs can be carried out.

Source: Claire Selltiz and others, Research Methods in Social Sciences, 1962, p. 50.


2.7 Important Concepts in Research Design

[Figure: Concepts in Research Design: Variables, Experiment, Control Group, Experimental Group]

Fig. 2.7a: Important Concepts in Research Design

Variables

A variable is a concept that can take on different quantitative values. It is

anything that varies or changes from one instance to another, and it can exhibit differences

in value or direction. Example: Concepts like weight, height, and income, which vary

from individual to individual.

Continuous Variable

It is a variable that can take any value, even a decimal value, between its minimum

value and maximum value. Example: The recorded temperature of a city.

Discrete Variable

It is a variable that takes only integer values. Example: The count of children in a family.

Variables are also classified according to their dependency on each other.

Dependent Variable

Dependent variable is a process outcome or variable that is predicted and/or explained

by other variables.

Independent Variable

Independent variable is a variable that is expected to influence the outcome of the

dependent variable, in some way. For example, customer loyalty may be a dependent

variable that is influenced or predicted by independent variables, such as service quality,

brand awareness or customer satisfaction.


Extraneous Variable

Extraneous variables are independent variables that are not related to the purpose of

the study but may affect the dependent variable. They are not under the control of the

researcher.

Experiment

The process of examining the truth of a statistical hypothesis (H0), relating to some

research problem, is known as an experiment. The purpose of an experiment is to study

causal links, that is, whether a change in an independent variable produces a change in

a dependent variable.

Major Terms Used in Experiments

Treatments

The different conditions under which experimental and control groups are tested are

referred to as treatments.

Experimental (Sampling) Units

These are the pre-determined plots or blocks to which the different treatments are applied;

they must be defined precisely in the research design.

Control and Experimental Groups

In a classic experiment, two groups are established and members are assigned

to each group. The two groups should be as similar as possible in all aspects relevant to the

research. When a group is exposed to some novel or special condition by intervention

or manipulation of independent variables, it is an experimental group. When a group is

exposed to the usual condition, it is a control group.

[Figure: flowchart with the columns 'Control Group' and 'Experimental Group': group members assigned at random; dependent variable is measured; manipulation of independent variable; dependent variable is measured]

Fig. 2.7b: Control and Experimental Groups


2.8 Types of Research Design

Different research designs can be categorized, depending on research approaches:

Exploratory Research Design

Exploratory research, also termed formulative research, is a valuable means

of finding out:

What is happening

New insights

The questions to be asked

The assessment of phenomenon in a new light

To conduct exploratory research, three principal methods are used:

A search of the literature

Interviewing experts in the subject

Conducting focus group interviews

Exploratory research can be carried out using the following methods:

Literature Survey

The literature survey method is one of the simplest and most fruitful methods of

formulating precisely the research problem and developing hypotheses.

Hypotheses of earlier researchers are evaluated as the basis of further research.

Experience Survey

Experience survey means a survey of people who have had practical experience

with the problem to be studied. For such a survey, people who are competent and

can contribute new ideas may be carefully selected as respondents, to ensure a

representation of different types of experience. It may enable the researcher to

define the problem more concisely and help in the formulation of the research

hypothesis.

Focus Group Interview

It is an unstructured, free-flowing interview with a small group of six to ten

people. Focus groups are led by a trained moderator, who follows a flexible

format, encouraging dialogue among respondents.


Descriptive and Diagnostic Research Design

Descriptive research is concerned with describing the characteristics of a particular individual or

group. Diagnostic research is concerned with determining the frequency with which something

occurs or its association with something else.

The procedure for descriptive/diagnostic research is:

1. Formulating the objective of the study: What is the study about and why is it

being made?

2. Designing the methods of data collection: What techniques of gathering data will

be adopted, such as observation, questionnaires, etc.?

3. Selecting the sample: How much material will be needed? From which population

is the sample taken?

4. Collecting data: Where can the required data be found and to what time period

should the data relate?

5. Processing and analyzing the data.

6. Reporting the findings.

Hypothesis Testing Research Design

It is concerned with the testing of hypothesis for the causal relationships between

variables and helps in drawing inferences about the causality. Testing of hypothesis

employs statistical procedures, in which the inferences about the target population are

drawn from a study sample. Experimental design is the method used for conducting

hypothesis-testing research. While conducting hypothesis-testing research, three basic

principles of experimental design are to be followed for improving the accuracy of

research inferences about dependent variables.

The principles of experimental design are:

Principle of replication

Principle of randomization

Principle of local control


2.9 Basic Principles of Experimental Design

Principle of Replication

The research design should be such that the experiment can be repeated more than

once. Each treatment is applied to many experimental units, instead of one. Because the

experiment is repeated, the statistical accuracy of the estimated relationship between variables

is increased.

Principle of Randomization

Randomization is the random assignment of treatments to the experimental or sample

units. The research design should be planned such that while experimenting, the

variations caused by the extraneous variables/factors can all be combined under the

general heading of 'chance'. Thus, the members of experimental or sample units are to

be selected in a random manner.
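The principle of randomization can be illustrated with a short Python sketch that shuffles the experimental units and then deals them out to the treatments. The store names, the helper `randomize_treatments`, and the fixed seed are hypothetical choices made only for this illustration.

```python
import random

def randomize_treatments(units, treatments, seed=None):
    """Randomly assign treatments to units in (near-)equal numbers."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    # Deal the shuffled units out to the treatments in round-robin order
    return {unit: treatments[i % len(treatments)]
            for i, unit in enumerate(shuffled)}

# Hypothetical experimental units: 12 retail stores
stores = [f"store_{i:02d}" for i in range(1, 13)]
assignment = randomize_treatments(stores, ["control", "experimental"], seed=42)

for store in stores:
    print(store, "->", assignment[store])
```

Because chance alone decides which store receives which treatment, the influence of extraneous factors is spread across both groups instead of favouring one of them.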

Principle of Local Control

Randomization and replication do not remove all the extraneous sources of variation;

experimental errors still remain but are unknown. Local control refers to the grouping of

the experimental units in such a way that the units within a group are more

homogeneous than units in different groups. Dividing the experimental units into such

homogeneous groups is known as blocking, and treatments are then randomly assigned to

the units within each block. Blocking is done in such a way that variation due to the extraneous variable

remains fixed within a block. Under this principle, the design is planned such that the variability due to

the extraneous variable can be measured, in order to reduce experimental errors.
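Local control can be sketched in Python as randomization carried out separately inside each block. In this hypothetical example, city size is assumed to be the extraneous variable, and the helper `blocked_randomization` and the store data are invented for illustration.

```python
import random
from collections import defaultdict

def blocked_randomization(units, block_of, treatments, seed=None):
    """Randomize treatments separately within each homogeneous block."""
    rng = random.Random(seed)
    blocks = defaultdict(list)
    for unit in units:
        blocks[block_of(unit)].append(unit)
    assignment = {}
    for block_units in blocks.values():
        # Shuffle inside the block only, so every block receives every treatment
        rng.shuffle(block_units)
        for i, unit in enumerate(block_units):
            assignment[unit] = treatments[i % len(treatments)]
    return assignment

# Hypothetical units: (store name, city size); city size is the blocking variable
stores = [("A", "large"), ("B", "large"), ("C", "large"), ("D", "large"),
          ("E", "small"), ("F", "small"), ("G", "small"), ("H", "small")]

plan = blocked_randomization(stores, block_of=lambda s: s[1],
                             treatments=["control", "experimental"], seed=7)

for (name, size), treatment in sorted(plan.items()):
    print(name, size, treatment)
```

Each block contributes the same number of units to every treatment, so differences between large and small cities cannot be mistaken for a treatment effect.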


2.10 Chapter Summary

Formulation of the research question and stating the hypothesis are key

preliminary steps in the research process.

The research question or a research problem statement presents the idea that is

to be examined in the study and is the foundation of the research.

The final research question consists of a statement about the relationship of two or

more variables.

A hypothesis is a declarative statement about the relationship between two or

more variables that predicts an expected outcome.

Research design is a framework or blueprint for conducting research.

Sampling Design

and Sampling

Techniques

03. Sampling Design and Sampling Techniques eBook

3.1 Introduction

If researchers want to discover the most pressing financial problems faced by people

in general, ranging from low wages to rising health care and housing costs, they would have

to ask everyone for their opinions. However, due to economic and time constraints, it

is not possible to question every person.

Instead, small representative groups can be selected from the general population for research

and analysis, so that the required data is obtained within the available time frame. Selecting such

a group is known as sampling, and the ways in which it is done are known as sampling techniques.

The collection of all the observations under consideration is known as a 'population' or

'universe.' A complete enumeration or study of all items in the population is known as a

census study or census inquiry.

Often, it is not possible to study each and every observation in the population due to

time, money, and many other constraints. In that case, a fraction of that population is

studied and it is known as a sample study or sample survey, that is, the part of the

population taken for study is the sample.

After reading this chapter, you will be able to:

Define population, census, and sample

Explain sampling design and its procedure

Explain the criteria for selecting a sampling procedure

Explain the various types of sampling techniques


3.2 Population, Census, and Sample

Population

Population is any complete group of entities that share some common set of

characteristics, about which inferences are to be made. For example, all registered voters

in India or all members of the international teachers union. Population characteristics are

known as parameters.

Census

Census is an investigation of all the individual elements that make up a population, that is, a

total enumeration rather than a sample. In a census, the survey investigator studies

the characteristics of each and every entity in the population. For example, the census

of India is a big source of a variety of statistical information, which includes the different

characteristics of the people. This census data is collected once every 10 years.

Sample

Sample is a subset or some part of a population, used to make inferences about the

whole population, as shown in Fig. 3.2a. Sampling involves the process of selecting a

number of representative study units from a defined study population. Sample

characteristics are known as statistics.

[Figure: a sample shown as a subset of the population]

Fig. 3.2a: Population and Sample
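The distinction between a parameter and a statistic can be made concrete with a small Python sketch. The household incomes below are simulated, hypothetical figures; only the idea that a parameter describes the population while a statistic describes the sample comes from the text.

```python
import random
from statistics import mean

rng = random.Random(1)

# Hypothetical population: monthly incomes of 10,000 households
population = [rng.gauss(50_000, 12_000) for _ in range(10_000)]
parameter = mean(population)          # population characteristic (parameter)

sample = rng.sample(population, 200)  # a subset drawn from the population
statistic = mean(sample)              # sample characteristic (statistic)

print(f"Population mean (parameter): {parameter:,.0f}")
print(f"Sample mean (statistic):     {statistic:,.0f}")
```

The statistic is used as an estimate of the parameter; the two values are close but not identical, because the sample contains only part of the population.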


3.3 Sampling Design

A systematic plan for obtaining a sample from a given population is known as a sample

design. It refers to the technique or the procedure that the researcher should adopt in

selecting items for the sample. Three decisions have to be considered while designing a

sample:

Who will be surveyed? - Sample:

Determine what type of information is needed and who is most likely to have it.

How many will be surveyed? - Sample size:

Large samples give more reliable results than small samples.

How should a sample be chosen? - Type of Sampling:

Sample members may be chosen at random from an entire population, also known

as 'probability sampling.'

Sample members or units might be chosen as per the requirement in general or as per

the convenience of the researcher, also known as 'non-probability sampling' or

'judgmental sampling.'

3.4 Sample Design Procedure

The sampling design procedure that should be considered while selecting a sample is:

1. Selecting Target Population

The target population should have the characteristics, about which inferences are

to be drawn. For consumer-related surveys, the appropriate population elements

frequently used are households. The population can be finite or infinite, depending

on the certainty of the number of members. If numbers are known, such as

population of a city or the number of workers in a factory, then the population is

finite. If the count is not known, such as listeners of a specific radio program, then

the population is said to be infinite.

Selecting target population (or the set of objects, technically called the Universe,

to be studied) is the first step in developing any sample design. Depending on the

number of items in the universe, it can be finite or infinite. In a finite universe, the

number of items is certain. However, in case of an infinite universe, the number of

items is infinite.


The population of a town and the number of workers in a factory are examples of a

finite universe. The number of fish in the sea, viewers of a specific TV serial

programme, etc. are examples of an infinite universe.

2. Select a S a m p l i n g Frame

A complete list of all cases in the population from which the sample will be drawn

is known as the sampling frame. An investigator has to take decisions concerning

a sampling unit before selecting a sample.

A sampling unit may be geographical, such as a village, state, or district; it may be a

social unit, such as a school, club, or family; or it may be an individual. The

researcher will have to decide on one or more such units to select for

the study.

For example, if the research objective is concerned with the members of the sports

club, then the sampling frame will contain a complete list of individuals who are

members of that club.

3. Determine if Probability or Non-probability Sampling Method will be

Chosen

In probability sampling, every element in the population has a known, non-zero

probability of selection. The simple random sample, in which each member of the

population has an equal probability of being selected, is the best-known probability

sample. In non-probability sampling, the probability of any particular member of

the population being selected is unknown. The selection of sampling units in

non-probability sampling is quite arbitrary, as researchers rely heavily on personal

judgment.
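The contrast between the two approaches can be sketched in Python using the sports club example from the previous step. The membership list, the sample size of 20, and the choice of a convenience sample to represent non-probability sampling are all assumptions made for illustration.

```python
import random

# Hypothetical sampling frame: membership list of a sports club
frame = [f"member_{i:03d}" for i in range(1, 201)]
n = 20

# Probability sampling: a simple random sample, in which every member
# has the same known chance n / N of being selected
rng = random.Random(2024)
srs = rng.sample(frame, n)
inclusion_probability = n / len(frame)

# Non-probability sampling: a convenience sample, here simply the first
# 20 names on the list; no random mechanism defines the selection chances
convenience = frame[:n]

print("Inclusion probability under simple random sampling:", inclusion_probability)
print("Simple random sample:", sorted(srs)[:5], "...")
print("Convenience sample:  ", convenience[:5], "...")
```

Only for the simple random sample can the chance of selection be stated in advance, which is what allows sampling error to be measured later.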

4. Determine Sample Size

This refers to the number of items or units to be selected from the population, in

order to constitute a sample. An optimum sample should satisfy

the requirements of efficiency, representativeness, reliability, and flexibility. While

deciding the sample size, budgetary and time constraints are to be considered.

Sample size is nothing but the total number of units to be selected from the

universe, in order to form a sample.


One of the main problems for the researcher is the selection of the sample size. The

sample size should be neither too large nor too small, that is, it must be

optimal. An optimal sample size can easily fulfil requirements, such as reliability,

representativeness, efficiency, and flexibility. When determining

the sample size, the researcher must decide on the desired precision to be

achieved and also on an acceptable confidence level for the estimate. The required

sample size also depends on the population size.
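The chapter does not give a formula for sample size. As one commonly taught illustration of how precision, confidence level, and population size enter the calculation, the Python sketch below uses Cochran's formula for estimating a proportion, with a finite population correction. The function name, the 5% margin of error, the 95% confidence level, and the population of 2,000 are assumptions chosen for the example.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(margin_of_error, confidence=0.95, p=0.5,
                               population_size=None):
    """Sample size needed to estimate a proportion (Cochran's formula).

    p = 0.5 is the most conservative assumption about the true proportion.
    """
    # z-value for the chosen two-sided confidence level
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n0 = (z ** 2) * p * (1 - p) / margin_of_error ** 2
    if population_size is not None:
        # Finite population correction: smaller populations need fewer units
        n0 = n0 / (1 + (n0 - 1) / population_size)
    return math.ceil(n0)

n_large = sample_size_for_proportion(0.05)                         # very large population
n_finite = sample_size_for_proportion(0.05, population_size=2000)  # population of 2,000

print("Very large population:", n_large)
print("Population of 2,000:  ", n_finite)
```

Tightening the margin of error or raising the confidence level increases the required sample size, while a small population reduces it.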

5. Parameters of Interest

While determining the design of a sample, one must consider the

specific population parameters that are of interest.

For example, you may be interested in estimating the proportion of students with

some characteristic in the population or you may be interested in knowing some

average or another measure concerning the population.

6. Budgetary Constraint

Costs involved in the total sampling procedure have a great impact on decisions

relating to the size, as well as, to the type of sample.

7. Sampling Procedure

The researcher must decide about the technique to be used in choosing items for

the sample. This is a part of the sample design itself. There are several sample

designs, out of which the researcher must choose one for his study. Obviously, he

must select the design which, for a given sample size and cost, has a smaller

sampling error.


3.5 Characteristics of a Good Sample Design

Some characteristics of a good sample design are:

It must result in a sample that is truly representative of the population.

It must result in a small sampling error.

It must be viable in the context of funds available for the research

study.

It must be able to control systematic bias in a better way.

The results of the sample study should be applicable, in general, to the universe,

with a reasonable level of confidence.

Source: C. R. Kothari, Research Methodology: Methods and Techniques, New Age International Publishers,

2nd Edition

3.6 Criteria for Selecting a Sampling Procedure

There are two types of costs involved in sampling analysis: the cost of an incorrect

inference, resulting from incorrect data, and the cost of collecting the data. Incorrect

inferences arise from two causes: systematic bias and sampling error. Errors in the sampling

procedures result in systematic bias, which cannot be eliminated or reduced by

increasing the sample size. At best, the causes responsible for these errors can be detected

and corrected.

A systematic bias is the result of one or more of the following factors:

1. Inappropriate sampling frame

If the sampling frame is inappropriate, that is, a biased representation of the

universe, it will result in a systematic bias.

2. Defective measuring device

If the measuring device is constantly in error, it will result in systematic bias. In

survey work, systematic bias can result if the questionnaire or the interviewer is

biased.

3. Non-respondents

If you are unable to sample all the individuals initially included in the sample,

there may arise a systematic bias.


4. Indeterminacy principle

Sometimes, individuals act differently, when kept under observation, than they do

when kept in non-observed situations. Thus, the indeterminacy principle may also

be a cause of systematic bias.

5. Natural bias in the reporting of data

Natural bias of respondents in the reporting of data is often the cause of a

systematic bias in many inquiries. There is usually a downward bias in the income

data collected by the government taxation department; whereas, there is an

upward bias in the income data collected by social organizations. People in general understate their incomes when asked for tax purposes, but overstate them when asked in connection with social status or affluence.
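The point that systematic bias is not reduced by a larger sample can be illustrated with a small simulation. The sketch below is illustrative only: the population of incomes, the 20% understatement rate, and the function name `mean_reported_income` are assumptions made for this example and are not taken from the source.

```python
import random


def mean(values):
    return sum(values) / len(values)


def mean_reported_income(population, sample_size, understatement, rng):
    """Mean income reported by a simple random sample whose members all
    understate their true income by the same fraction."""
    sample = rng.sample(population, sample_size)
    return mean([income * (1 - understatement) for income in sample])


rng = random.Random(42)  # fixed seed so the run is repeatable
# Hypothetical population of 100,000 true incomes.
population = [rng.uniform(20_000, 80_000) for _ in range(100_000)]
true_mean = mean(population)

relative_errors = {}
for n in (100, 1_000, 10_000):
    estimate = mean_reported_income(population, n, 0.20, rng)
    relative_errors[n] = (estimate - true_mean) / true_mean
    print(f"n = {n:>6}: relative error = {relative_errors[n]:+.1%}")
```

Under these assumptions the relative error should stay close to -20% at every sample size. A larger sample narrows the random scatter around that figure but does not move the figure itself, which is why the cause of the bias has to be found and corrected.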

Sampling Errors

Sampling errors are the random variations in sample estimates. Sampling error decreases as the size of the sample increases, and it can be measured for a given sample design and size. The measurement of sampling error is usually called the 'precision of the sampling plan'. Increasing the sample size improves precision, but a larger sample also raises the cost of collecting the data. Thus, the effective way to increase precision is, usually, to select a better sampling design, which has a smaller sampling error for a given sample size, at a given cost.

Sampling errors are the random variations in the sample estimates around the

true population parameters.

Sampling error decreases with the increase in sample size.
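For a simple random sample, sampling error is commonly summarised by the standard error of the sample mean, which is the population standard deviation divided by the square root of the sample size. The sketch below assumes a large population and a purely illustrative standard deviation of 15 units; the function name is hypothetical.

```python
import math


def standard_error(std_dev, n):
    """Standard error of the sample mean for a simple random sample of size n."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return std_dev / math.sqrt(n)


SIGMA = 15.0  # assumed population standard deviation (hypothetical)
for n in (25, 100, 400, 1600):
    print(f"n = {n:>4}: standard error = {standard_error(SIGMA, n):.2f}")
```

Quadrupling the sample size only halves the standard error, which is one reason why a better sampling design is often a more economical route to precision than a larger sample.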

Thus, a major criterion while selecting a sampling procedure is to ensure that the

procedure causes a relatively small sampling error and helps to control the systematic

bias in a better way.

Source: C.R. Kothari, Research Methodology: Methods and Techniques, New Age International Publishers,

2nd Edition


3.7 Types of Sampling Techniques

Different types of sample designs depend on two factors, namely, the representation basis and the element selection technique. On the representation basis, the sample may be selected by probability sampling or non-probability sampling. Probability sampling is based on the concept of random selection, whereas non-probability sampling is 'non-random' sampling. On the element selection basis, the sample may be unrestricted or restricted. In an unrestricted sampling procedure, each sample element is selected individually from the population at large. All other forms of sampling are covered under the term 'restricted sampling'.

The major types of sampling techniques are shown in Fig. 3.7a.

Non-probability Sampling Techniques: Convenience Sampling, Judgment Sampling, Quota Sampling, Snowball Sampling

Probability Sampling Techniques: Simple Random Sampling, Systematic Sampling, Stratified Sampling, Cluster Sampling

Fig. 3.7a: Types of Sampling Techniques

3.7.1 Non-probability Sampling Techniques

Non-probability sampling is also known as judgement, deliberate or purposive sampling. In this method, the researcher deliberately chooses particular units of the population to constitute a sample that is expected to represent the whole population. The types of non-probability sampling include convenience sampling, judgment sampling, quota sampling, and snowball sampling. In this method, the investigator may select a sample that yields results favourable to his or her point of view, in which case the entire inquiry may get vitiated. There is always a danger of serious bias entering into this type of sampling technique.


Convenience Sampling

In this type of sampling, the researcher forms a sample from the units of the population that are most convenient to reach; hence, the method is referred to as convenience sampling. Only respondents who happen to be in the right place at the right time get selected. In exploratory research, convenience samples are best used when additional research will subsequently be conducted with a probability sample.

Judgment Sampling

Another type of non-probability sampling is judgmental (purposive) sampling, in which a researcher selects the units of the sample based on his or her judgment about some appropriate characteristic required of them. This method is often used with very small samples, as in case study research or where the research is informative. The samples selected by this method satisfy the specific purposes of the research but will not fully represent the population. This sampling technique is used to obtain information from a very specific group of people.

Quota Sampling

Quota sampling is a non-probability sampling procedure in which various subgroups of a population are represented on the basis of pertinent characteristics, to the exact extent that the investigator desires. It is a two-stage, restricted judgmental sampling. In the first stage, the population is divided into various groups and a quota is calculated for each group, depending on relevant and available data. In the second stage, sample elements are selected from the quota groups on the basis of convenience or judgment.

This sampling method is usually used for interviews and surveys. For example, an interviewer conducts 100 interviews in a particular city to find the monthly expenditure of its residents. The interviewer selects the sample with 10% from the high class, 60% from the middle class, 10% from the lower middle class, and 20% from the rest, according to the quota assigned to each group.
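The quota arithmetic in this example can be written out as a short sketch. The function name `quota_counts` and the group labels are illustrative choices; only the percentages and the total of 100 interviews come from the example above.

```python
def quota_counts(total_interviews, quota_percent):
    """Convert percentage quotas into the number of interviews per group."""
    if sum(quota_percent.values()) != 100:
        raise ValueError("quota percentages must add up to 100")
    # Integer division: if the total is not a multiple of 100, the
    # leftover interviews would have to be assigned by hand.
    return {group: total_interviews * percent // 100
            for group, percent in quota_percent.items()}


quotas = quota_counts(100, {
    "high class": 10,
    "middle class": 60,
    "lower middle class": 10,
    "rest": 20,
})
print(quotas)
```

Within each group, the interviewer still chooses the actual respondents by convenience or judgment, so the quotas fix the composition of the sample but do not make the selection random.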

Snowball Sampling

A snowball sampling is a non-probability sampling procedure, in which initial

respondents are selected by probability methods or randomly and additional

respondents are obtained from the information provided by the initial respondents.


First, a group of respondents is selected randomly; subsequent respondents are then selected on the basis of referrals made by the previous group.

This technique is usually used to locate members of a rare population and in cases where identifying the population is very difficult. Systematic bias occurs frequently with such sampling. For example, people can claim unemployment benefits by hiding information about their employment.
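The referral process can be sketched as a walk over a table of who refers whom. The referral table, the respondent labels, and the function name `snowball_sample` below are hypothetical; in practice the initial respondents would be chosen randomly, whereas here a single starting respondent is fixed so that the example is repeatable.

```python
from collections import deque


def snowball_sample(referrals, initial_respondents, max_size):
    """Collect respondents wave by wave, following referrals from earlier ones."""
    selected = []
    seen = set(initial_respondents)
    queue = deque(initial_respondents)
    while queue and len(selected) < max_size:
        person = queue.popleft()
        selected.append(person)
        for referred in referrals.get(person, []):
            if referred not in seen:  # nobody is recruited twice
                seen.add(referred)
                queue.append(referred)
    return selected


# Hypothetical referral table: each respondent names further members.
referrals = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"], "D": [], "E": ["F"]}
sample = snowball_sample(referrals, ["A"], max_size=5)
print(sample)
```

Because every later respondent is reachable only through an earlier one, people outside the referral network can never enter the sample, which is one way systematic bias arises with this technique.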

3.7.2 Probability Sampling Techniques

Probability sampling is also known as 'chance' or 'random' sampling. In this method, each item of the population has an equal chance of being included in the sample. For example, in the lottery method, individual units are picked from the whole group by some mechanical process. The results obtained from random or probability sampling can be measured in terms of probability. The types of probability sampling include simple random sampling, systematic sampling, stratified sampling, and cluster sampling.

Simple Random Sampling

A simple random sample is a sample of size n, drawn from a population of size N in such a way that every possible sample of size n has an equal chance of being selected. In other words, simple random sampling is a probability sampling procedure that assures each element in the population an equal chance of being included in the sample. Simple random sampling is best used when an accurate and easily accessible sampling frame that lists the entire population is available, preferably stored on a computer. For a small sample size, methods like drawing numbers or names from a fishbowl or using a spinner are appropriate. However, for a large sample size, random number generation techniques are applied to obtain the sample units.

A simple random sample is a subset of units selected from a population. Each unit is selected randomly and entirely by chance, such that each unit has the same chance or probability of being selected at any stage during the sampling process. Each subset of n units has the same probability of being selected for the sample as any other subset of n units. This technique and process is known as simple random sampling. A simple random sample is one of the unbiased surveying methods.

Suppose N people want to get a ticket for a movie but there are only X tickets, where X < N. So, the authority decides to distribute the tickets among the people without any bias. Everyone is given a number in the range from 0 to N - 1, and random numbers are drawn, either from a table of random numbers or electronically, with the help of a computer. Only numbers in the range 0 to N - 1 are considered, and any number that has already been selected is ignored. The holders of the first X numbers drawn get the X tickets.

In both small and large populations, this type of sampling is typically done 'without replacement', that is, one avoids selecting any individual of the population more than once. Alternatively, simple random sampling can be carried out with replacement. For a small sample from a large population, sampling without replacement is approximately the same as sampling with replacement, since the odds of selecting the same individual twice are low.
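The ticket example and the two modes of selection can be sketched with Python's standard `random` module. The values N = 500 and X = 20, the seed, and the function names are assumptions chosen only for illustration.

```python
import random


def draw_without_replacement(population_size, sample_size, rng):
    """Each number from 0 to N - 1 can be drawn at most once."""
    return rng.sample(range(population_size), sample_size)


def draw_with_replacement(population_size, sample_size, rng):
    """The same number may be drawn more than once."""
    return [rng.randrange(population_size) for _ in range(sample_size)]


rng = random.Random(2024)  # fixed seed so the run is repeatable
N, X = 500, 20             # hypothetical: 500 people, 20 tickets

winners = draw_without_replacement(N, X, rng)
repeats_allowed = draw_with_replacement(N, X, rng)
print(sorted(winners))
print(sorted(repeats_allowed))
```

Drawing without replacement does in one step what the example describes by hand, namely ignoring any number that has already been selected.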

Systematic Sampling

Systematic sampling is a probability sampling procedure in which a starting unit is selected by a random process and then every i-th element on the list is selected as a subsequent sample member. The sampling interval 'i' is obtained by dividing the population size 'N' by the sample size 'n' and rounding to the nearest integer. When the ordering of the elements is related to the characteristic of interest, systematic sampling increases the representativeness of the sample.

For example, there are 100,000 individuals in the population and a sample of 1,000 is

required. In this case, the sampling interval, 'i', is 100. Now, select a random number

between 1 and 100. If this number is 23, then the sample consists of elements 23, 123,

223, 323, etc.
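The example above can be reproduced in a few lines. This is a minimal sketch that assumes the population elements are numbered from 1 to N; the function name `systematic_sample` is illustrative.

```python
import random


def systematic_sample(population_size, sample_size, start=None, rng=None):
    """Return the element numbers (1-based) chosen by systematic sampling."""
    interval = round(population_size / sample_size)  # sampling interval i
    if start is None:
        rng = rng or random.Random()
        start = rng.randint(1, interval)  # random start between 1 and i
    selected = range(start, population_size + 1, interval)
    return list(selected)[:sample_size]


sample = systematic_sample(100_000, 1_000, start=23)
print(sample[:4], "...", sample[-1], len(sample))
```

The start is fixed at 23 here only to match the worked example; leaving `start` unset draws it at random, which is the single random step in the whole procedure.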

Stratified Sampling

Stratified sampling is a probability sampling procedure in which simple random subsamples are drawn from within each stratum of the population, where the elements of a stratum are more or less equal in some characteristic. Elements in different strata should be as heterogeneous as possible.


There are two primary reasons for using stratified sampling: the sample will be more representative of the population, and it ensures that a specific number of individuals is selected from each category.

[Figure panels: Population, Strata, Sample]

Fig. 3.7.2a: Stratified Sampling

The stratified sampling technique is generally applied to obtain a representative sample when the population from which the sample is to be drawn does not constitute a homogeneous group. In this method, the population is divided into several sub-populations that are individually more homogeneous than the total population. These sub-populations are called 'strata'. Items are selected from each stratum to constitute the sample. Because the variation within each stratum is much less than that of the population as a whole, precise estimates can be computed for each stratum and, thus, a better estimate of the population as a whole is derived.

Stratified sampling gives more detailed and reliable information. The following three questions are highly relevant in the context of stratified sampling:

a) How to form strata?

b) How should items be selected from each stratum?

c) How many items should be selected from each stratum or how to allocate the

sample size of each stratum?

There are two methods of stratified sampling:

Proportionate Stratified Sampling

In this sampling, the size of the sample drawn from each stratum is proportionate to the relative size of that stratum in the total population.

Disproportionate Stratified Sampling

In this sampling, the size of the sample from each stratum is proportionate to the relative size of that stratum and to the standard deviation of the distribution of the characteristic of interest among all the elements in that stratum.


As variability increases, the sample size must increase in order to provide accurate estimates. Hence, to increase sample efficiency, that is, to produce a smaller random sampling error, the strata with larger variability are sampled more heavily.
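The two allocation rules can be compared on a small example. The sketch below assumes three hypothetical strata with made-up sizes and standard deviations and a total sample of 100; the function names are illustrative. Allocating in proportion to stratum size multiplied by stratum standard deviation is one common way to implement the disproportionate rule described above.

```python
def proportionate_allocation(total_sample, stratum_sizes):
    """Sample size per stratum in proportion to the stratum's share of the population."""
    population = sum(stratum_sizes.values())
    return {name: round(total_sample * size / population)
            for name, size in stratum_sizes.items()}


def disproportionate_allocation(total_sample, stratum_sizes, stratum_std_devs):
    """Sample size per stratum in proportion to size times standard deviation."""
    weights = {name: stratum_sizes[name] * stratum_std_devs[name]
               for name in stratum_sizes}
    total_weight = sum(weights.values())
    return {name: round(total_sample * weight / total_weight)
            for name, weight in weights.items()}


# Hypothetical strata: sizes and standard deviations of the characteristic.
sizes = {"stratum 1": 5000, "stratum 2": 3000, "stratum 3": 2000}
std_devs = {"stratum 1": 10, "stratum 2": 20, "stratum 3": 40}

proportionate = proportionate_allocation(100, sizes)
disproportionate = disproportionate_allocation(100, sizes, std_devs)
print(proportionate)
print(disproportionate)
```

In this example the smallest stratum receives the largest share under the disproportionate rule because it is the most variable. Rounding can make the allocations add up to slightly more or less than the intended total, so the result may need a manual adjustment.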

Cluster Sampling

Cluster sampling is a probability sampling procedure in which the population of interest is divided into representative 'clusters' of individuals. Clusters should be as similar to one another as possible, so that each cluster is a small-scale representation of the population itself. Cluster sampling is used when the researcher cannot get a complete list of the members of the population, because in that case it is impossible or impractical to draw a simple random sample or a stratified sample. In cluster sampling, the population units are first grouped, and then the groups, or clusters, rather than individual elements, are selected for inclusion in the sample.

In cluster sampling, the total area is divided into a number of smaller, non-overlapping areas, which are known as clusters. After appropriate clusters have been formed, a number of them are randomly selected, and all the units in the selected clusters are included in the sample.
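A minimal sketch of this procedure is given below. The ward names, the unit labels, and the function name `cluster_sample` are hypothetical and serve only to show that whole clusters, not individual units, are drawn at random.

```python
import random


def cluster_sample(clusters, number_of_clusters, rng):
    """Randomly pick whole clusters and return every unit inside them."""
    chosen = rng.sample(sorted(clusters), number_of_clusters)
    units = [unit for name in chosen for unit in clusters[name]]
    return chosen, units


# Hypothetical area divided into four non-overlapping clusters.
clusters = {
    "Ward A": ["A1", "A2", "A3"],
    "Ward B": ["B1", "B2"],
    "Ward C": ["C1", "C2", "C3", "C4"],
    "Ward D": ["D1", "D2", "D3"],
}

chosen, units = cluster_sample(clusters, 2, random.Random(7))
print(chosen)
print(units)
```

Only a list of clusters is needed to draw the sample, which is why the method works when no complete list of individual population members is available.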


3.8 Chapter Summary

The collection of all the observations under consideration is known as a 'population'. A complete enumeration or study of all items in the 'population' (or universe) is known as a census study or census inquiry.

Sample is a subset or a part of the population that is used to make inferences

about the whole population. Sampling involves the process of selecting a number

of representative study units from a defined study population. Population

characteristics are known as parameters.

A definite plan for obtaining a sample from a given population is known as a

sample design.

Procedures for sample design are:

o Select the target population

o Select the sampling frame

o Determine if probability or non-probability sampling method will be chosen

o Determine sample size

Two major costs involved in a sampling analysis are:

o Cost of collecting data

o Cost of an incorrect inference resulting from the data

The major causes of incorrect inferences are:

o Systematic Bias

o Sampling Errors

Methods and Tools of Data Collection
Research Methodology

04. Methods and Tools of Data Collection eBook

4.1 Introduction

In real-life situations, we deal with different types of data, which are nothing but values of qualitative or quantitative variables related to a particular set of items.

Data collection is a process of preparing and collecting data. The main purpose of data

collection is to obtain information to make decisions, to keep a record about important

topics and also, to pass information to others. Primarily, collected data provides

information regarding a specific topic.

While dealing with any real-life problem, it is sometimes discovered that the data at hand is inadequate; it then becomes essential to collect data that is applicable. There are several ways of collecting appropriate data, which differ considerably in terms of the time, cost, and other resources at the disposal of the researcher.

After reading this chapter, you will be able to:

Explain the types of data: primary and secondary data

Describe the different methods used to collect data

Discuss the merits and demerits of various data collection methods

Explain how to collect data through a questionnaire

Explain the main aspects of a questionnaire

Illustrate how to design a questionnaire

Discuss the requirements of a good questionnaire

Explain the method of case study

Describe the characteristics of a case study

Discuss the assumptions of a case study

Describe advantages and limitations of a case study


4.2 Data Types

Data collection is the next step after defining and planning the research problem. There are various methods of data collection, and the method used depends on the type of data to be collected. There are two types of data: primary and secondary. Data collected afresh, for the first time, is known as primary data, while secondary data is data that has already been collected by another person and has already passed through the statistical process.

Researchers first have to decide which type of data needs to be collected for the study; they can then select the method of data collection. The methods used to collect primary data are different from those used to collect secondary data. Facts and statistics collected together for reference or analysis are known as data. Data are values of qualitative or quantitative variables belonging to a set of items. In processing, data is represented in a structure, which is often tabular but may also be a graph or a tree. Data is typically the result of measurements and can be visualized using graphs or images.

The process of gathering and measuring information on variables of interest is known as

data collection. This process can be established in a systematic fashion that enables one

to answer the stated research questions, evaluate outcomes, and test hypotheses.

The term data collection is commonly used in research in all fields of study, including the physical and social sciences, the humanities, and business. While methods vary by discipline, the emphasis on ensuring accurate and honest collection remains the same. Regardless of how the data is defined (qualitative or quantitative), accurate data collection is essential to maintaining the integrity of research.

The two types of data are explained below.

4.2.1 Primary Data

In experimental research, you can collect primary data in the course of conducting the experiment. In descriptive research, however, primary data can be collected through observation or through direct communication with respondents, for example, in personal interviews. Data which is collected afresh and for the first time, and which is therefore original in character, is known as primary data. It is obtained directly from first-hand sources by means of observation, experimentation or surveys. It is in an unpublished form because it is obtained from an original survey or research study.


There are various methods of primary data collection. Some of the important methods are:

Observation method

Interview method

Through questionnaires

Through schedules

Other methods include:

o Warranty cards

o Distributor audits

o Pantry audits

o Consumer panels

o Using mechanical devices

o Through projective techniques

o Depth interviews

o Content analysis

Now, you will study some of these methods in detail:

Observation Method

This is the most commonly used method of primary data collection, especially in studies

related to behavioral sciences. In this method, data is collected by observing things

around. Here, an observation becomes a scientific tool. Information is obtained by an

investigator's own direct observation, without asking the respondent. For example, in

consumer behavior study, the investigator may look at the watch, instead of inquiring

about the brand of the wrist watch that is used by the respondent.

In observation method, information is gathered by watching events, behavior or noting

physical characteristics in their natural setting. Observations can be overt (everyone

knows they are being observed) or covert (no one knows they are being observed and

the observer is concealed). Observations may also be obtained either directly or

indirectly.

In direct observation method, you watch behaviors, interactions or processes, as they

occur. For example, observing teachers teaching a topic from a written curriculum, in

order to determine whether they are delivering it with fidelity. In indirect observation

method, you watch the results of behaviors, interactions or processes. For example,

measuring the quantity of food wasted by students in a school cafeteria, to determine

whether introduction of a new food is acceptable or not.


Merits

The researcher is able to keep a record of the natural behavior of the group.

The researcher can even gather information which could not be easily obtained by observing in a disinterested fashion.

The researcher can even verify the truth of statements made by informants in the

context of a questionnaire or a schedule.

Demerits

If the observer participates emotionally, he or she may lose objectivity.

The problem of observation-control is not solved.

It may restrict the researcher's range of experience.

Source: C. R. Kothari, Research Methodology-Methods and Techniques, Second Revised Edition, New Age

International Publishers, 2004.

Interview Method

In the interview method, data is collected by presenting oral-verbal stimuli and receiving replies in terms of oral-verbal responses. Examples are telephone interviews and personal interviews.

In the personal interview method, a person, known as the interviewer, asks questions while face-to-face with the participant.

Merits

More information can be obtained and in greater depth.

Interviewers can overcome the resistance of the respondents.

There is greater flexibility under this method; it can be applied in recording verbal answers to various questions.

Personal information can be obtained easily.

Demerits

It is a very time-consuming, expensive method, especially when a large and

widely spread geographical sample is taken.

There remains the possibility of the bias of interviewer, as well as, that of the

respondent.

Certain types of respondents, such as important executives, officials or people in high income groups, may not be easily approachable under this method and, to that extent, the data may prove inadequate.


The presence of the interviewer on the spot may over-stimulate the respondent, sometimes even to the extent that the respondent may give imaginary information just to make the interview interesting.

Source: C. R. Kothari, Research Methodology-Methods and Techniques, Second Revised Edition, New Age

International Publishers, 2004.

Through Questionnaires

This method is quite popular, particularly in the case of big enquiries. It is adopted by private and public organizations, private individuals, research workers, and even by the government. A questionnaire is sent to the persons concerned, with a request to answer the questions and return the questionnaire. The questionnaire is considered the main part of the survey method; hence, it must be constructed very carefully.

If it is not properly set up, the survey is bound to fail. The general form, question sequence, and question formulation and wording are the main aspects of a questionnaire. A questionnaire is a set of questions typed or printed in a definite order on paper. The respondents have to answer the questions on their own.

Merits

It is low cost, even when the universe is large and is widely spread geographically.

It is free from the bias of the interviewer; answers are in the respondent's own

words.

Respondents have adequate time to think carefully before answering.

Respondents, who are not easily approachable, can also be reached conveniently.

Large samples can be used; thus, the results can be made more dependable and reliable.

Demerits

The rate of return of duly filled-in questionnaires is low; bias due to non-response is often indeterminate.

It can be used only when respondents are educated and cooperating.

The control over questionnaires may be lost once it is sent.

There is inbuilt inflexibility due to the difficulty of amending the approach once questionnaires have been dispatched.


It is difficult to know whether willing respondents are truly representative.

This method is likely to be the slowest of all.

Source: C. R. Kothari, Research Methodology-Methods and Techniques, Second Revised Edition, New Age

International Publishers, 2004.

Through Schedules

This method of data collection is very similar to the collection of data through questionnaires. The difference is that schedules are filled in by enumerators who are appointed for this purpose. Schedules may also be handed over to respondents, with enumerators helping them to record their answers against the questions. Enumerators explain the aim and objective of the investigation; they also remove the difficulties that any respondent may have in understanding the implications of a particular question or the definition or concept of difficult terms. The enumerators should be trained well, and the scope and nature of the investigation should be explained to them thoroughly, so that they understand the implications of the different questions put in the schedule. Enumerators must possess the capacity for cross-examination. They must be intelligent, hardworking, sincere, and honest, and should have patience and perseverance.

Merits

Non-response is generally very low.

In case of schedule, the identity of the respondent is known.

The information is collected well in time, as the schedules are filled in by enumerators.

Direct personal contact is established with respondents.

Information can be gathered even when the respondents happen to be illiterate.

Demerits

Collecting data through schedules is relatively more expensive because a

considerable amount of money has to be spent in appointing enumerators and in

providing and imparting training to them. Money is also spent in preparing

schedules.

There remains the danger of interviewer bias and cheating.

There usually remains difficulty in sending enumerators over a relatively wider

area.

Source: C. R. Kothari, Research Methodology-Methods and Techniques, Second Revised Edition, New Age

International Publishers, 2004.


4.2.2 Secondary Data

Data which has already been collected and analyzed by someone else, and which has already passed through the statistical process, is known as secondary data. Such data is already available, and it may be in published or unpublished form. If you are using secondary data, you have to look into the various sources from which it can be obtained.

Sources of Published Data

Publications of the state and central governments.

Publications of international bodies and their subsidiary organizations or of foreign governments.

Trade and technical journals.

Magazines, books, and newspapers.

Publications and reports of various associations connected with industry and

business, banks, stock exchanges, etc.

Reports prepared by research scholars, economists, universities etc. in different

fields.

Public records and statistics, historical documents of published information. This

type of data is used very carefully.

Sources of Unpublished Data

Letters, diaries, autobiographies, and unpublished biographies.

Scholars and research workers, labor bureaus, trade associations, and other public/private individuals and organizations.

Before using secondary data, the researcher must check the following characteristics of the data:

Reliability of Data

The reliability of data depends on factors like:

Who collected the data?

What were the sources of data?

Were they collected by using proper methods?

At what time were they collected?

Was there any bias of the compiler?

What level of accuracy was desired? Was it achieved?


Suitability of Data

Data that is suitable for one type of enquiry may not necessarily be suitable for another. If the available data is found to be unsuitable, it should not be used by the researcher. The researcher must very carefully scrutinize the definitions of the various terms and units of collection used at the time the data was collected from the primary source.

Adequacy of Data

If the level of accuracy achieved in the data is known and is found to be inadequate, the researcher should not use that data for further study. Data related to an area that is either narrower or wider than the area of the present enquiry will also be considered inadequate. This means that using secondary data is risky: it should be used only if it is found to be reliable, suitable, and adequate. At the same time, available data should not be blindly rejected if it comes from authentic sources and meets these conditions; in that case, it would not be economical to spend time and energy on field surveys to collect the same information.

4.3 Questionnaire Design

A questionnaire contains a set of questions, especially one addressed to a statistically significant number of subjects, as a way of gathering information for a survey.

The Main Aspects of a Questionnaire

The General Form

The general form of a questionnaire can either be structured or unstructured.

Structured questionnaires are questionnaires in which there are concrete, definite,

and pre-determined questions. The form of the question may be either closed (the

type 'yes' or 'no') or open (inviting free response) but should be stated in advance

and not constructed during questioning.

A wide range of data, in the respondent's own words, cannot be obtained with

structured questionnaires. In such situations, unstructured questionnaires may be

used effectively. On the basis of the results obtained in pretest (testing before

final use) and operations from the use of unstructured questionnaires, one can

construct a structured questionnaire for use in the main study.


Question Sequence

A proper sequence of questions considerably reduces the chances of individual questions being misunderstood. The question sequence must be clear and move smoothly, meaning that the relation of one question to another should be readily apparent to the respondent, with the questions that are easiest to answer placed first.

The following types of questions should be avoided:

Questions that put great strain on the memory or intellect of the respondent.

Questions of a personal character.

Questions related to personal wealth.

Question Formulation and Wording

Each question should be constructed so that it forms a logical part of a carefully considered tabulation plan. In general, all questions should meet the following standards:

Should be easily understood

Should be simple

Should be concrete

Should conform, as much as possible, to the respondent's way of thinking

Since words are likely to affect responses, they should be properly chosen. Simple

words, which are familiar to all respondents, should be preferred. Words with

ambiguous meanings must be avoided. Similarly, danger words, catch-words or

words with emotional connotations should be avoided.

Sample of a Questionnaire

International Students Questionnaire

This is a research study conducted by a group of medical students. Please do NOT write your name on the questionnaire, as this study is anonymous. Do not feel obligated to answer all questions if you are uncomfortable or unable to do so. Thank you very much for taking the time to complete our questionnaire; your effort is greatly appreciated.


1. Are you male or female?

   Male ...................... 1 [ ]
   Female .................... 2 [ ]

2. What is your current age?

   17 - 19 ................... 1 [ ]
   20 - 22 ................... 2 [ ]
   23 - 25 ................... 3 [ ]
   26 - 30 ................... 4 [ ]
   30+ ....................... 5 [ ]

3. Which continent are you from?

   Europe .................... 1 [ ]
   Asia ...................... 2 [ ]
   Africa .................... 3 [ ]
   North America ............. 4 [ ]
   South America ............. 5 [ ]
   Australia/New Zealand ..... 6 [ ]

4. What is your first language?

   English ................... 1 [ ]
   Non-English ............... 2 [ ]

5. Rate your ability to communicate in English, when you came to Canada.

   None [ ]  1 [ ]  2 [ ]  3 [ ]  4 [ ]  5 [ ]  6 [ ]  7 [ ]  8 [ ]  9 [ ]  Excellent [ ]

6. What religion do you identify yourself with?

   Buddhist .................. 1 [ ]
   Christian ................. 2 [ ]
   Hindu ..................... 3 [ ]
   Islam ..................... 4 [ ]
   Jewish .................... 5 [ ]
   Non-denominational ........ 6 [ ]
   Other ..................... 7 [ ]


7. What year of study are you currently enrolled in?

   1 ......................... 1 [ ]
   2 ......................... 2 [ ]
   3 ......................... 3 [ ]
   4 ......................... 4 [ ]
   Other ..................... 5 [ ]

8. What program are you currently enrolled in?

   Bachelor of Arts ........................ 1 [ ]
   Bachelor of Business Administration ..... 2 [ ]
   Bachelor of Science ..................... 3 [ ]
   Nursing ................................. 4 [ ]
   Human Kinetics .......................... 5 [ ]

Source: http://people.stfx.ca/wjackson/questionnaires/2004/Gl WJ04.pdf
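
The numeric code printed beside each option in the sample questionnaire is what makes later tabulation straightforward. The following is a minimal Python sketch of that idea, assuming a handful of made-up coded answers to question 3. The list `responses` and the helper `tabulate` are hypothetical and are not part of the original study.

```python
from collections import Counter

# Code book for question 3, copied from the sample questionnaire.
CONTINENT_CODES = {
    1: "Europe",
    2: "Asia",
    3: "Africa",
    4: "North America",
    5: "South America",
    6: "Australia/New Zealand",
}

# Hypothetical coded answers from eight respondents (illustrative only).
responses = [2, 1, 2, 4, 2, 3, 1, 2]


def tabulate(coded_answers, code_book):
    """Return a {label: count} frequency table for coded answers."""
    counts = Counter(coded_answers)
    return {label: counts.get(code, 0) for code, label in code_book.items()}


frequency_table = tabulate(responses, CONTINENT_CODES)
for label, count in frequency_table.items():
    print(f"{label:<22} {count}")
```

Because every option already has a code, closed questions of this kind can be counted directly, whereas open-ended answers would first have to be grouped into categories.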

4.4 Requirements of a Good Questionnaire

Questions should be short and simple.

Questions should proceed in logical sequence moving from easy to difficult

questions.

Personal and intimate questions should be left to the end.

Technical terms and vague expressions capable of different interpretations should

be avoided in a questionnaire.

Questions may be dichotomous (yes or no answers), multiple choice (alternative answers listed) or open-ended.

Control questions should be included, as they introduce a cross-check on whether the information collected is correct.

Questions affecting the sentiments of respondents should be avoided.

Adequate space for answers should be provided in the questionnaire to help

editing and tabulation.

The quality of the paper, along with its color, must be good so that it may attract the attention of respondents.

Source: C. R. Kothari, Research Methodology-Methods and Techniques, Second Revised Edition, New Age

International Publishers, 2004.


4.5 Case Study

This method involves a careful and complete observation of a social unit. A person, an institution, a family, a cultural group, or even an entire community are examples of social units. It is a very popular form of qualitative analysis. A case study emphasizes the full analysis of a limited number of events or conditions and their interrelations. It deals with the processes that take place and their interrelationships, and it is an intensive investigation of the particular unit under consideration. The objective of the case study method is to locate the factors that account for the behavior patterns of the given unit as an integrated totality.

Characteristics of a Case Study

The researcher can take a single social unit or more than one such unit for the purpose of the study.

The unit selected for study is studied intensively.

In this method, the researcher studies the social unit completely, covering all facets.

An effort is made to know the mutual interrelationship of causal factors.

The researcher can study the behavior pattern of the unit concerned directly and not by an indirect and abstract approach.

This method results in fruitful hypotheses, along with the data, which may be

helpful in testing them.

Assumptions of a Case Study

The assumption of uniformity in basic human nature, in spite of the fact that human behavior may vary according to situations.

The assumption of studying the natural history of the unit concerned.

The assumption of comprehensive study of the unit concerned.

Source: C. R. Kothari, Research Methodology-Methods and Techniques, Second Revised Edition, New Age

International Publishers, 2004.

Advantages of Case Study

It enables the researcher to fully understand the behavioral pattern of the unit concerned.

A researcher can obtain a real and enlightened record of personal experiences.

It enables the researcher to trace out the natural history of the social unit and its relationship with the social factors and the forces involved in its surrounding environment.


It helps in formulating relevant hypotheses, along with data, which may be helpful in testing them.

This method facilitates an intensive study of social units, which is generally not possible with other methods.

It helps the researcher in the task of constructing an appropriate questionnaire or schedule, which requires thorough knowledge of the universe concerned.

The researcher can use one or more of the several research methods under the case study method, depending on the prevalent circumstances.

It is beneficial in determining the nature of the units to be studied, along with the

nature of the universe.

Case studies constitute the perfect type of sociological material, as they represent

a real record of personal experiences, which very often escapes the attention of

the most skilled researchers using other techniques.

Case study method enhances the experience of the researcher and this, in turn,

increases their analyzing ability and skill.

This method makes the study of social changes possible. On account of the minute study of the different facets of a social unit, the researcher can well understand the social change, then and now.

Source: C. R. Kothari, Research Methodology-Methods and Techniques, Second Revised Edition, New Age

International Publishers, 2004.

Limitations of Case Study

Case situations are seldom comparable and the information gathered in case

studies is often not comparable as such.

Real information is often not collected because the subjectivity of the researcher

does enter in the collection of information in a case study.

No set rules are followed in the collection of the information and only a few units are studied. It consumes more time and requires a lot of expenditure.

Case data is often vitiated. Sampling is not possible under a case study method.

Case study method is based on several assumptions, which may not be very realistic at times, and the usefulness of case data is always subject to doubt.

Response of the respondent is a vital limitation of the case study method.

Source: C. R. Kothari, Research Methodology-Methods and Techniques, Second Revised Edition, New Age

International Publishers, 2004.


4.6 Chapter Summary

Data are values of qualitative or quantitative variables belonging to a set of items.

Primary data are data collected afresh and for the first time and hence are original in character.

The methods for collecting primary data are the observation method, the interview method, questionnaires, and schedules.

Secondary data are data that have already been collected by someone else and have already been passed through the statistical process. Secondary data may be either published or unpublished.

The case study method involves careful and complete observation of a social unit.

5. Measurement and Scaling Techniques

5.1 Introduction

Measurement is an essential part of daily life. To cook a dish, one has to mix the required ingredients in the proper measures; if the ingredients are mixed without any measurement, the whole recipe can be spoiled. Measurement of this kind requires standards, and different measurement scales serve as these standards, depending on the variables or objects under study.

Some standard units of measurement are the liter for the volume of water, the kilogram for the weight of an object, and the meter for the height of an object. These are physical measurements. In the sciences and in business research, measuring the attitudes of the respondents from whom data is collected also requires an appropriate measurement scale. This chapter discusses the measurement scales used for measuring attitudes.

After reading this chapter, you will be able to:

Define measurement and scale

Explain various scales of measurement

Discuss comparative scaling techniques

Discuss categorical scaling techniques


5.2 Measurement and Scaling

Measurement

According to Zikmund, "Measurement is the process of describing some property of a phenomenon of interest, usually by assigning numbers in a reliable and valid way."

According to Nunnally (1978), "Measurement means assigning numbers or other

symbols to characteristics of objects according to certain pre-specified rules."

Thus, when an object or item is measured, it is assigned a numerical value, which in turn reflects a property of the object or item. To measure an object or item, it is important to know its properties and the concept behind it. Properties are specific characteristics of an object or item that help in distinguishing one object from another. The properties of an object may be objective or subjective: objective properties can be physically described, while subjective properties can only be mentally described. The concept of the object, that is, its meaning, should also be known, as it gives a brief view of the object.

Scaling

Zikmund defines a scale as "a device providing a range of values that correspond to different values in a concept being measured."

According to S. L. Gupta and Hitesh Gupta, "Various researchers have classified scales and scaling techniques based on scale properties, subject orientation, number of dimensions, scale construction techniques, etc."

They have provided various scale construction techniques or approaches:

Arbitrary approach: Scales are developed on an unplanned basis.

Consensus approach: A panel of experts evaluates the chosen items for their

inclusion in the measurement instruments.

Cumulative approach: Scales are chosen based on their conforming to some ranking of items, with ascending and descending discriminating power.

Factor approach: Scales are developed based on inter-correlations of items indicating the common factor accounting for the relationship between items.


5.3 Primary Scales of Measurement

The primary scales of measurement are classified into four categories, namely, nominal scale, ordinal scale, interval scale, and ratio scale.

Nominal Scale

The simplest scale of measurement is the nominal scale. It identifies the category into which an object falls, for example, male or female, married or unmarried.

The nominal scale is considered a qualitative scale because it involves only categorical data rather than metric data. Nominal data may be represented by numbers such as 1, 2, 3, but these numbers serve only as labels and carry no metric meaning. Nominal data can also be represented by symbols, letters, or figures. For example, in business research, the gender of the respondent from whom the data is collected may be labeled as 0 for female and 1 for male.

Ordinal Scale

An ordinal scale is a scale in which data is categorized according to some common properties, as in a nominal scale, but the categories are also arranged in ascending or descending order. Data that can be ordered in this way is termed ordinal data.

For example, three biscuit brands that are in very close competition in terms of their taste can be ranked, with the best-tasting brand first, followed by the other two. Some other examples of ordinal data are:

Socioeconomic status: High, low or poor

Position of student in an examination: First, second, third, etc.

Interval Scale

An interval scale has the properties of an ordinal scale, with additional information about the difference between the categories.

In this scale, the units of measurement between the numbers on the scale are all equidistant, that is, of equal interval. A good example of an interval scale is temperature (in °C or °F): the Celsius scale from 0 °C to 100 °C is divided into 100 equal parts, that is, the difference between any two successive numbers is fixed.


Ratio Scale

It is the most advanced level of measurement. The property being measured can be expressed accurately as a numerical value. Examples of ratio-scale variables are weight, height, distance, and sales, where zero denotes the absence of the property in the object.
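
The practical difference between an interval scale and a ratio scale is the meaning of zero. The sketch below illustrates this under simple assumptions: the temperatures and weights are made-up values, and `KG_TO_LB` is an approximate conversion factor. It shows that a ratio of two temperatures changes when the unit changes, while a ratio of two weights does not.

```python
def celsius_to_fahrenheit(c):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return c * 9 / 5 + 32


# Interval scale: zero is arbitrary, so ratios depend on the unit chosen.
c_low, c_high = 10.0, 20.0
ratio_celsius = c_high / c_low
ratio_fahrenheit = celsius_to_fahrenheit(c_high) / celsius_to_fahrenheit(c_low)

# Ratio scale: zero means "no weight", so ratios survive a change of unit.
KG_TO_LB = 2.20462  # approximate conversion factor (assumption)
kg_low, kg_high = 10.0, 20.0
ratio_kg = kg_high / kg_low
ratio_lb = (kg_high * KG_TO_LB) / (kg_low * KG_TO_LB)

print(f"temperature ratio: {ratio_celsius:.2f} in Celsius, "
      f"{ratio_fahrenheit:.2f} in Fahrenheit")
print(f"weight ratio:      {ratio_kg:.2f} in kg, {ratio_lb:.2f} in lb")
```

Saying that 20 °C is "twice as hot" as 10 °C is therefore not meaningful, while saying that 20 kg is twice as heavy as 10 kg is.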

Table 5.3a summarizes the different scales.

Level: Nominal
Examples: Employee ID number; Yes or No; Good or Bad; Religion: Hindu, Muslim, Christian, Sikh, etc.
Numerical operations: Counting
Descriptive statistics: Frequencies, Mode

Level: Ordinal
Examples: Student class rank; Indicate your level of education: high school, high school diploma, college degree, X standard pass
Numerical operations: Counting and ordering
Descriptive statistics: Frequencies, Mode, Median, Range

Level: Interval
Examples: Student's Grade Point Average (GPA); 100-point job performance rating provided by supervisor
Numerical operations: Common arithmetic operations
Descriptive statistics: Frequencies, Mode, Median, Range, Mean, Variance, Standard deviation

Level: Ratio
Examples: Salesperson's sales volume; Number of stores visited on a shopping trip; Annual family income
Numerical operations: All arithmetic operations
Descriptive statistics: Frequencies, Mode, Median, Range, Mean, Variance, Standard deviation

Table 5.3a: Different Measurement Scales

Source (for the table): Zikmund, Babin, Carr, Griffin, Business Research Methods, Eighth Edition
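
The descriptive statistics listed for each level in the table can be sketched in Python with the standard `statistics` module. All data below are hypothetical values chosen for illustration; only the pairing of scale level and statistic follows the table.

```python
import statistics
from collections import Counter

# Hypothetical data, one list per measurement level (illustrative only).
religion = ["Hindu", "Muslim", "Hindu", "Sikh", "Christian", "Hindu"]  # nominal
class_rank = [1, 2, 3, 4, 5]                                           # ordinal
performance_rating = [62, 75, 75, 88, 90]                              # interval
sales_volume = [120.0, 150.0, 150.0, 200.0, 380.0]                     # ratio

# Nominal: counting only, so frequencies and the mode.
nominal_frequencies = Counter(religion)
nominal_mode = statistics.mode(religion)

# Ordinal: order is meaningful, so the median and range are added.
ordinal_median = statistics.median(class_rank)
ordinal_range = (min(class_rank), max(class_rank))

# Interval and ratio: mean, variance, and standard deviation are added.
interval_mean = statistics.mean(performance_rating)
interval_variance = statistics.variance(performance_rating)  # sample variance
ratio_mean = statistics.mean(sales_volume)
ratio_stdev = statistics.stdev(sales_volume)                 # sample std. dev.

print("nominal :", dict(nominal_frequencies), "mode =", nominal_mode)
print("ordinal : median =", ordinal_median, "range =", ordinal_range)
print("interval: mean =", interval_mean, "variance =", interval_variance)
print("ratio   : mean =", ratio_mean, "stdev =", round(ratio_stdev, 2))
```

Each level keeps the statistics of the levels before it, which is why the lists in the table grow from nominal to ratio.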


Reliability and Validity of a Measurement Scale

Reliability of a measurement has been illustrated in simple words by Bruce Wren,

Robert Stevens, and David Louden as, "A reliable measure is one that

consistently generates the same result over repeated measures. For example, if

a scale shows that a standardized 1 lb weight actually weighs 1 lb when placed

on the scale today, tomorrow, and next Tuesday, then it appears to be reliable

scale. If it reads a different weight, then it is unreliable, the degree of

unreliability indicated by how frequently and by how much it reads an inaccurate weight."

Validity is the extent of accuracy of a measure. To determine the validity of a measure is not a simple evaluation. Thus, reliability measures consistency and validity measures the accuracy of a measure, which are criteria for a good measure.

Reliability vs. Validity

According to Zikmund, Babin, Carr, and Griffin, the difference between reliability and validity can be explained by an experiment, as shown in Fig. 5.3a: "Suppose an expert sharpshooter fires a number of rounds with a century-old rifle and modern rifle. The shots from the older gun are considerably scattered, but those from the newer gun are closely clustered. The variability of the old rifle compared with that of the new one indicated it is less reliable. The target on the right illustrates the concept of a systematic bias influencing validity. The new rifle is reliable (because it has little variance), but the sharpshooter's vision is hampered by glare. Although shots are consistent, the sharpshooter is unable to hit the bull's-eye."

[Figure: three bull's-eye targets. Target A: old rifle, low reliability. Target B: new rifle, high reliability. Target C: new rifle with sun glare, reliable but not valid.]

Fig. 5.3a: Bull's-eye with the Shots from Different Rifles

Source: Zikmund, Babin, Carr, Griffin, Business Research Methods, 8th Edition
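
The sharpshooter illustration can be imitated with a small simulation. This is only a sketch under stated assumptions: shots are reduced to a horizontal miss distance, and the spread and bias figures are invented for illustration. A small spread stands for high reliability, and a mean offset near zero stands for the absence of systematic bias.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable


def fire(n_shots, spread, bias=0.0):
    """Simulate horizontal miss distances (cm) from a bull's-eye at 0."""
    return [random.gauss(bias, spread) for _ in range(n_shots)]


# All numbers below are assumptions chosen for illustration.
old_rifle = fire(1000, spread=5.0)                  # Target A: scattered
new_rifle = fire(1000, spread=1.0)                  # Target B: tight, on target
new_rifle_glare = fire(1000, spread=1.0, bias=4.0)  # Target C: tight, off target

for name, shots in [("old rifle", old_rifle),
                    ("new rifle", new_rifle),
                    ("new rifle, glare", new_rifle_glare)]:
    spread = statistics.stdev(shots)  # small spread  -> reliable
    offset = statistics.mean(shots)   # offset near 0 -> no systematic bias
    print(f"{name:<18} spread={spread:5.2f}  mean offset={offset:5.2f}")
```

The third case reproduces Target C: the spread is as small as for the new rifle, yet the shots are consistently off the bull's-eye, so the measure is reliable but not valid.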


5.4 Classification of Scaling Techniques

Scaling techniques can be further classified into two main categories, namely,

comparative scale and categorical or non-comparative scale. They are classified

according to what the responses of the respondents under a study are.

Comparative Scale

In this type of scale, the respondent is asked to compare one object with another and furnish a response. For example, the respondent is asked to compare two cosmetic brands, Lakme and Ponds, on their effectiveness. The response, in such a case, can be either Lakme or Ponds, whichever the respondent finds more effective. However, in this scale, the respondent can only choose the brand he/she finds more effective and cannot assign a numerical value to the effectiveness.

Scales of this type are ordinal in character and are therefore also known as non-metric scales. There is no standard for this scale, and different respondents use different approaches or standards.

Categorical or Non-comparative Scale

In a categorical scale, the respondent rates each object individually on a scale, and each object is evaluated independently of the others. For example, three brands of deodorant are rated on the basis of their fragrance on a scale of 1 to 5, where 1 indicates the best and 5 indicates the worst fragrance, as shown in Fig. 5.4a.

Q. Rate the fragrance of the following brands on the scale below:

Brand 1
1      2       3     4    5
Best   Better  Good  Bad  Worst

Brand 2
1      2       3     4    5
Best   Better  Good  Bad  Worst

Brand 3
1      2       3     4    5
Best   Better  Good  Bad  Worst

Fig. 5.4a: Non-comparative Scale


The comparative scale is further classified into two different scales: rank order and paired comparison. The categorical or non-comparative scale is classified as continuous rating scale and itemized rating scale. The itemized rating scale is further divided into four different scales:

Likert Scale

Semantic Differential Scale

Cumulative Scale

Stapel Scale

The classification of the measurement scaling techniques is shown in Fig. 5.4b.

Scaling Techniques
    Comparative Scales
        Rank Order Scale
        Paired Comparison
    Categorical Scale
        Continuous Rating Scale
        Itemized Rating Scale
            Likert Scale
            Semantic Differential Scale
            Cumulative Scale
            Stapel Scale

Fig. 5.4b: Classification of Measurement Scaling Techniques

5.5 Comparative Scales

The different types of comparative scales are:

Rank Order Scale

In this scale, the respondent is asked to rank several objects based on certain properties or criteria. It is the simplest and quickest to apply. Ranking the extremes is very easy in this method but ranking the objects in between is difficult.


Example:

Rank the following soft drinks, on the basis of their taste, from 1 for the best tasting one to 5 for the worst tasting one.

Coca-Cola

7-Up

Sprite

Pepsi

Mountain Dew

Paired Comparison

As the name implies, this scale involves the comparison of different pairs of objects.

Here, the respondent is provided with pairs of objects and he/she has to select one of

the objects from the pair, on the basis of some criteria.

If there are n objects, then there will be n(n - 1)/2 pairs to be compared.
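
The pair count can be checked by listing the pairs directly. The following is a minimal Python sketch; the four object names are hypothetical placeholders, and `number_of_pairs` is an illustrative helper rather than part of any library.

```python
from itertools import combinations
from math import comb


def number_of_pairs(n):
    """Number of distinct pairs among n objects: n(n - 1)/2."""
    return n * (n - 1) // 2


objects = ["A", "B", "C", "D"]          # hypothetical objects
pairs = list(combinations(objects, 2))  # every unordered pair, listed once

print(pairs)
print(number_of_pairs(len(objects)), len(pairs), comb(len(objects), 2))
```

The count grows quickly with n, which is why paired comparison becomes tiring for respondents when many objects are involved.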

The data obtained in this scale is ordinal and the responses of the respondent obtained

can be transformed into a matrix form. This method was given by L. L. Thurstone

(1927).

The method can be explained with the help of an example. Suppose six apparel brands, namely, Biba, Aurelia, W, Global Desi, Kimaya, and Fab India, are compared on the basis of their design, and a sample of 100 female customers or respondents is taken. There will be 15 pairs [6(6 - 1)/2 = 15]. After each respondent furnishes a response in the forms, the preferences can be presented in a matrix form as illustrated in Table 5.5a.

              Biba   Aurelia   W   Global Desi   Kimaya   Fab India
Biba           x        0      1        1           1         1
Aurelia        1        x      1        1           1         1
W              0        0      x        0           0         0
Global Desi    0        0      1        x           0         0
Kimaya         0        0      1        1           x         1
Fab India      0        0      1        1           0         x
Total          1        0      5        4           2         3

Table 5.5a: Table for Paired Comparison Scale Method


In Table 5.5a, a value of 1 indicates that the brand in that column is preferred over the brand in that row. For example, the 1 in the Aurelia row under the Biba column means that Biba is preferred over Aurelia.
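
The Total row of the paired comparison table can be reproduced by summing each column. The sketch below copies the 0/1 entries of the table; the function `column_totals` and the derived `ranking` are illustrative additions and not part of the original method description.

```python
brands = ["Biba", "Aurelia", "W", "Global Desi", "Kimaya", "Fab India"]

# Rows and columns follow the table above; None marks the diagonal (x).
# A 1 means the column brand is preferred over the row brand.
preference = [
    [None, 0, 1, 1, 1, 1],  # Biba
    [1, None, 1, 1, 1, 1],  # Aurelia
    [0, 0, None, 0, 0, 0],  # W
    [0, 0, 1, None, 0, 0],  # Global Desi
    [0, 0, 1, 1, None, 1],  # Kimaya
    [0, 0, 1, 1, 0, None],  # Fab India
]


def column_totals(matrix):
    """For each column brand, count how many other brands it is preferred to."""
    size = len(matrix)
    return [sum(matrix[r][c] for r in range(size) if r != c)
            for c in range(size)]


totals = dict(zip(brands, column_totals(preference)))
ranking = sorted(brands, key=totals.get, reverse=True)

print(totals)
print(ranking)
```

Sorting the brands by their totals turns the paired choices into a rank order, which is consistent with the data from this scale being ordinal.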

5.6 Categorical Scales

The different types of categorical scales are:

Continuous Rating Scale

In this scale, the respondents mark (✓) the position on a number scale or pictorial scale that best reflects their rating. The rating scale may be represented by line diagrams, scales with pictures, and others.

Example:

1. How would you rate the services of DTDC as country-specific courier service?

Very good Good Quite good Neither Quite bad Bad Very bad

Fig. 5.6a: Continuous Rating Scale

2. Is the conducted test easy or difficult?

Very Easy Very Difficult

Or

Very Easy Very Difficult

7 6 5 4 3 2 1

Fig. 5.6b: Continuous Rating Scale

3. Are you satisfied with your job?


0 1 2 3 4 5

Very Very

Satisfied Unsatisfied

Fig. 5.6c: Continuous Rating Scale

In the above question 3, a pictorial scale is used.


Itemized Rating Scale

An itemized rating scale is also known as a numerical rating scale. In this scale, a series of statements is given, from which the respondent selects the statement that matches his/her response. Unlike the continuous scale, this scale offers a limited set of ordered categories from which the respondent selects. The scale should preferably have an odd number of categories, most commonly five to nine.

According to variables, respondents, etc., the itemized rating scale has the following

divisions:

Likert Scale (Summated Scale)

Semantic Differential Scale

Cumulative Scale

Stapel Scale

Likert Scale (Summated Scale)

This scale was developed by Rensis Likert in 1932. According to Mukul Gupta and

Deepa Gupta, "Summated scale consists of a number of statements which

express either a favorable or unfavorable attitude towards the given object to

which the respondent is asked to react. The respondent indicates his

agreement or disagreement with each statement in the instrument."

This scale is a five-point scale and the scale ranges from 1 to 5 or -2 to +2, with 0 as the neutral response.

Example:

1. Are you interested in mutual fund investments?

Very interested   Somewhat interested   Neutral   Not very interested   Not at all interested
       5                   4               3               2                      1

Table 5.6a: Likert Scale

2. Was the workshop on 'Marketing Research' useful to you?

Strongly Disagree   Disagree   Neutral   Agree   Strongly Agree
       -2              -1         0        1            2

Table 5.6b: Likert Scale
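
Because the Likert scale is a summated scale, a respondent's attitude score is the sum of the scores on the individual statements. The following Python sketch shows one common way to compute it, assuming a 1 to 5 coding and assuming that unfavorable statements are reverse-scored before summing. The three statements and the answers are hypothetical.

```python
# Each item: (statement, is_favorable). The statements are hypothetical.
items = [
    ("Mutual funds are a sensible way to invest.", True),
    ("Mutual funds are too risky for me.", False),
    ("I would recommend mutual funds to a friend.", True),
]

SCALE_MIN, SCALE_MAX = 1, 5  # 1 = strongly disagree ... 5 = strongly agree


def summated_score(answers, items):
    """Sum the item scores, reverse-scoring unfavorable statements."""
    if len(answers) != len(items):
        raise ValueError("one answer per item is required")
    total = 0
    for answer, (_, is_favorable) in zip(answers, items):
        if not SCALE_MIN <= answer <= SCALE_MAX:
            raise ValueError(f"answer {answer} is outside the scale")
        # Reverse scoring maps 1 -> 5, 2 -> 4, ..., 5 -> 1.
        total += answer if is_favorable else SCALE_MIN + SCALE_MAX - answer
    return total


respondent_answers = [4, 2, 5]  # hypothetical respondent
score = summated_score(respondent_answers, items)
print(score)
```

With reverse scoring, a higher total consistently means a more favorable attitude, whichever way an individual statement is worded.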


Semantic Differential Scale

Charles E. Osgood, G. J. Suci, and P. H. Tannenbaum (1957) developed the scale known as the semantic differential scale. In this scale, words rather than numbers are used on a seven-point scale: two opposite adjectives are placed at the extreme points of the bipolar scale, and the respondent selects the position that reflects his/her rating.

Q. Rate the furniture of the brand 'Zuari':

Strong Weak

Expensive Inexpensive

Fashionable Unfashionable

Fig. 5.6c: Semantic Scale

Cumulative Scale

It is also known as Louis Guttman's scalogram analysis, named after the person who

developed it. It consists of some statements, which the respondent has to give his/her

verdict of agreement or disagreement on.

According to C. R. Kothari, "Scalogram analysis refers to the procedure for determining whether a set of items forms a unidimensional scale."

          Questions            Respondent's
   4      3      2      1         Score
   ✓      ✓      ✓      ✓           4
   x      ✓      ✓      ✓           3
   x      x      ✓      ✓           2
   x      x      x      ✓           1

✓ = Agreement; x = Disagreement

Table 5.6d: Response Pattern in Scalogram Analysis


From Table 5.6d, you can see that the respondent is provided with two options, agreement and disagreement, for each question asked. The questions are built up in such a way that if the respondent's answer to question 2 is positive, then the answer to question 1 is also expected to be positive. Likewise, if the respondent's answer to question 3 is agreement, then the answers to questions 1 and 2 are also expected to be agreement. Hence, it is termed a cumulative scale.
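
The cumulative property described above can be tested mechanically for a single respondent. The sketch below is a simplified illustration, not a full scalogram analysis: it assumes the questions are ordered from question 1 upward, and the function names `is_cumulative` and `scale_score` are hypothetical.

```python
def is_cumulative(pattern):
    """Check one respondent's answers against the cumulative pattern.

    pattern[i] is True if the respondent agrees with question i + 1.
    The pattern is cumulative when every agreement comes before every
    disagreement, for example agree, agree, disagree, disagree.
    """
    seen_disagreement = False
    for agrees in pattern:
        if agrees and seen_disagreement:
            return False
        if not agrees:
            seen_disagreement = True
    return True


def scale_score(pattern):
    """Number of questions agreed with (the respondent's score)."""
    return sum(pattern)


fits = [True, True, True, False]       # agrees with questions 1 to 3
violates = [True, False, True, False]  # agrees with 3 but not with 2

print(is_cumulative(fits), scale_score(fits))
print(is_cumulative(violates), scale_score(violates))
```

When a pattern is cumulative, the score alone is enough to tell which questions the respondent agreed with, which is what makes the set of items behave as a single scale.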

Stapel Scale

The Stapel scale was developed by Jan Stapel. It is a 10-point interval scale from +5 to -5, with no neutral point 0. This scale is unipolar because only one adjective is under consideration. Unlike the other scales, it has an even number of categories.

Example:

How would you rate the extent to which the tag line of a particular product matches accordingly?

(+5)

(+4)

(+3)

(+2)

(+1)

Perfectly Matches

(-1)

(-2)

(-3)

(-4)

(-5)

Fig. 5.6e: Stapel Scale

This scale is usually presented vertically. Respondents select their response on the basis of their perspective as to what degree the phrase 'perfectly matches' is appropriate to the context. A larger positive number indicates higher accuracy, while a more negative number indicates lower accuracy.

Table 5.6d summarizes the different measurement scales.

Non-comparative Scale

Scale: Continuous Rating Scale
Basic characteristics: Place a mark on a continuous line
Examples: Reaction to TV commercials
Advantages: Easy to construct
Disadvantages: Scoring can be cumbersome, unless computerized

Itemized Rating Scale

Scale: Likert Scale
Basic characteristics: Degree of agreement on a 1 (strongly disagree) to 5 (strongly agree) scale
Examples: Measurement of attitudes
Advantages: Easy to construct, administer, and understand
Disadvantages: More time consuming

Scale: Semantic Differential Scale
Basic characteristics: Seven-point scale with bipolar labels
Examples: Brand, product, and company images
Advantages: Versatile
Disadvantages: Controversy about whether or not the data are interval

Scale: Stapel Scale
Basic characteristics: Unipolar ten-point scale, -5 to +5, without a neutral point (zero)
Examples: Measurement of attitudes and images
Advantages: Easy to construct; administered over telephone
Disadvantages: Confusing and difficult to apply

Table 5.6d: Different Measurement Scales

Source: Naresh K. Malhotra, Satyabhushan Dash, Marketing Research, An applied Orientation, Fifth Edition, Pearson Education, 2007


Before preparing a non-comparative itemized rating scale, you should keep the following factors in mind:

The number of categories under study.

Whether a balanced or unbalanced scale is to be used. A balanced scale is a scale in which the numbers of positive and negative categories are equal, while an unbalanced scale does not have equal numbers of positive and negative categories.

Whether an odd or even number of categories is to be used.

Whether the scale is to be a forced or an unforced rating scale. In a forced rating scale, respondents are forced to express an opinion, as the scale does not contain a 'no opinion or comment' option, while in an unforced scale, respondents are given an option of 'no opinion' if they find that the options are not applicable or they cannot disclose their response.

The verbal descriptions of the categories vary between scales, so they should be presented in such a way that the responses stay close to the question.

The scales can be presented either horizontally or vertically. They can also be represented by boxes or by lines, with or without numbers.

5.7 Chapter Summary

Measurement is the process of assigning numbers or scores to characteristics or

attributes of the objects or people of interest.

The primary scales of measurement are:

o Nominal Scale: Labeling of objects

o Ordinal Scale: Ranking the objects

o Interval Scale: Expressing relative meaning

o Ratio Scale: Expressing absolute values

Scaling techniques are classified as follows:

o Comparative Scaling Technique: Comparing two objects

o Categorical Scaling Technique: Rating the same attribute of an object

6. Tabulation and Analysis of Data

6.1 Introduction

This chapter discusses the tabulation and analysis of data. Tabular representation presents large data sets in a compact manner, allows different variables to be compared at one time, and makes it easy to total the values of each variable. Tabulation therefore gives a concise overview of the data.

Analysis of data is the most essential part of research, because it leads to the conclusion of the research. The analysis should be done scientifically, using appropriate statistical techniques or tools; choosing an appropriate tool is what yields correct findings. This chapter discusses some of the statistical techniques used for the analysis of data.

After reading this chapter, you will be able to:

Define tabulation and its parts

State the principles of tabulation

Discuss multiple regression analysis

Discuss multiple discriminant analysis

Discuss the measures of central tendency

Explain the measures of dispersion

Describe the measures of skewness

Explain the measures of relationships

Describe the association of attributes

Explain measures like time series and index numbers


6.2 Tabulation

Tabulation is the representation of data in a compact form. It is used in all kinds of reports, articles, journal papers, etc. to summarize data. Tables have numerous uses, and their importance in research is immense. Tabulation is highly recommended because it organizes the data in a systematic manner, conveys the information contained in the data, and supports the further analysis required to interpret the research findings.

As explained by Tuttle, a table is "The logical listing of related quantitative data in vertical columns and horizontal rows of numbers with sufficient explanatory and qualifying words, phrases and statements in the form of titles, headings and explanatory notes to make clear the full meaning, context and the origin of the data."

Thus, the technique of organizing the given data in a tabular form is known as tabulation. In a table, it is very important that the headings for each of the rows and columns are properly inserted according to the data. There is always a title to define the table. If there is a source from which the table is taken, or any other additional information relating to the table, it is provided below the table. A table consists of the table number and title, rows, columns, and footnotes, as shown in Fig. 6.2a.

[Table number and title of the table]
Table 2: Rwandan mineral production (1995-2000)

[Column headings]
Year    Gold          Cassiterite   Coltan        Diamond
        Production    Production    Production    Exports
        (kg)          (tons)        (tons)        (US$)

[Rows]
1995    1             247           54            n/a
1996    1             330           97            n/a
1997    10            327           224           $720,425
1998    17            330           224           $16,606
1999    10            309           122           $439,347
2000    10            437           83            $1,788,036

[Source]
Sources: Coltan, cassiterite and gold figures derived from Rwandan Official Statistics (No. 227/01/10/MIN); diamond figures from the Diamond High Council. (All figures originally appeared in the UN Panel of Inquiry, 2001. All 2000 figures are to October.)

Fig. 6.2a: Parts of a Table


The table title should be concise, convey the complete meaning of the table, and follow the table number. In Fig. 6.2a, the table shows mineral production from 1995 to 2000, taken from Rwandan official statistics, and is appropriately titled "Rwandan mineral production (1995-2000)".

The horizontal and vertical arrangements of data are known as the rows and columns of the table, respectively. In Fig. 6.2a, there are five columns (the year and four mineral series) and six data rows. Each data value is placed in an individual cell. If, other than the source, there is more information to be furnished with the table, such as short forms or abbreviations used in the table, it can be provided below the table after the source (if any). All this information together is known as footnotes.

The principles of a table can be stated as:

1. Every table must contain a title that is clearly understandable and concise.

2. A table should be properly numbered for future reference.

3. The column headings and row headings should be clear and short.

4. The units of measurement should always be mentioned wherever necessary in the row or column heading. For example, as illustrated in Fig. 6.2a, units of measurement like 'kg' and 'tons' are provided.

5. The footnotes should be placed beneath the table, along with any other additional information that needs to be highlighted in the table.

6. The source should be placed below the table.

7. The columns can be numbered for reference. For example, if a table contains ten columns, numbering them makes each column easy to refer to.

8. The columns of data that are to be compared should be placed side by side.

9. The values in the cells should be properly aligned. The decimal points and signs like + and - should also be aligned.

10. Abbreviations should be used rarely, and only if necessary; ditto marks should be avoided.

11. Representing a large amount of data in one table can make it look messy and unclear. In such cases, the data should be split across several tables rather than clumped into a single table.

12. The row totals are placed in the extreme right column of the table, while the column totals are placed in the last row of the table.

Tables, figures, and graphs give a brief representation of the data, which helps in the analysis of the findings. The analysis is then carried out according to the type and size of the data. It is very important to apply an appropriate statistical technique in order to get appropriate results. There are many statistical techniques used in data analysis; this chapter discusses some of the important ones.

6.3 Multiple Regression Analysis

Regression analysis enables us to obtain an equation depicting the relationship between two variables. Similarly, multiple regression analysis enables us to obtain the relationship between more than two variables.

Consider that $X_1$ and $X_2$ are two independent variables and $Y$ is a dependent variable. The multiple regression equation can be expressed as:

$$Y = a + b_1 X_1 + b_2 X_2$$

Where,

$X_1$ and $X_2$ = Independent variables

$Y$ = Dependent variable

$a$, $b_1$, $b_2$ = Constants

To obtain the regression equation, you need to calculate the values of the constants. This can be done by solving the following normal equations, where $N$ is the number of observations:

$$\sum Y = Na + b_1 \sum X_1 + b_2 \sum X_2$$

$$\sum X_1 Y = a \sum X_1 + b_1 \sum X_1^2 + b_2 \sum X_1 X_2$$

$$\sum X_2 Y = a \sum X_2 + b_1 \sum X_1 X_2 + b_2 \sum X_2^2$$
6.4 Multiple Discriminant Analysis

Discriminant analysis is appropriate in situations where the independent variables are quantitative and the dependent variable is categorical in nature. For example, on the basis of an applicant's age, income, length of time at present home, etc., a credit manager wishes to classify a person as either a good or a poor credit risk.

In terms of measurement scales, discriminant analysis is suitable when the criterion or dependent variable is nominal and the predictor or independent variables are interval in nature (Lachenbruch, 1975). The discriminant analysis involving more than two variables is known as multiple discriminant analysis.

The equation showing such a relationship between 'n' variables is called a discriminant function and is given as:

$$Z = w_1 x_1 + w_2 x_2 + \dots + w_n x_n$$

Where,

$Z$ = Discriminant score

$w_i$ = Discriminant weight for variable $x_i$

$x_i$ = $i$th independent variable

Discriminant analysis is widely used in business research. According to Gupta (2003), the following aspects can be studied using this type of analysis:

Identification of new buyer group

Consumer behavior towards new products or brands

Brand loyalty study

Relationship between variables

Checklist of properties of new products

6.5 Measures of Central Tendency

Measures of central tendency help to obtain a representative value for studying the characteristics of the population from which the data is extracted. Such a value is also known as a statistical average, because an average value of the data is obtained when these measures are computed.

The measures of central tendency are also known as measures of location, because the value obtained indicates the position or location of the data. The measures of central tendency are:

Mean/Arithmetic mean/Average

Mode

Median

Harmonic Mean

Geometric Mean


6.5.1 Mean/Arithmetic mean/Average

It is the most commonly used measure of central tendency and the simplest of all. It is obtained by dividing the sum of all the values of the observations by the total number of observations.

For 'n' observations $x_1, x_2, x_3, \dots, x_n$, the arithmetic mean is given as:

$$\bar{X} = \frac{\sum_{i} x_i}{n}$$

Where,

$\bar{X}$ = Arithmetic mean

$\sum_{i} x_i$ = Sum of all the 'n' observations

$n$ = Total number of observations

For example, if the leaves taken by 10 employees in a bank in the last three months are 2, 4, 6, 1, 2, 3, 1, 2, 1, and 5, then the average number of leaves is obtained by the arithmetic mean:

$$\bar{X} = \frac{2 + 4 + 6 + 1 + 2 + 3 + 1 + 2 + 1 + 5}{10} = \frac{27}{10} = 2.7$$

If the corresponding frequencies $f_1, f_2, f_3, \dots, f_n$ of these 'n' observations are given, then the arithmetic mean is given as:

$$\bar{X} = \frac{\sum_{i} f_i x_i}{\sum_{i} f_i}$$

Where,

$\bar{X}$ = Arithmetic mean

$\sum_{i} f_i x_i$ = Sum of the products of the observations and their corresponding frequencies

$\sum_{i} f_i$ = Total frequency


Example: Consider the frequency distribution illustrated in Table 6.5.1a.

x     0    1    2    3
f     2    4    6    8

Table 6.5.1a: Frequency Distribution Table

The arithmetic mean is given as:

$$\bar{X} = \frac{\sum_{i} f_i x_i}{\sum_{i} f_i}$$

For this formula, 'fx' is calculated as illustrated in Table 6.5.1b:

x     0    1    2    3    Total
f     2    4    6    8    20
fx    0    4    12   24   40

Table 6.5.1b: Frequency Distribution Table

Therefore, $\bar{X} = \frac{40}{20} = 2.0$
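Both forms of the arithmetic mean can be checked with a few lines of code. This is a small illustrative sketch in Python; the function names `arithmetic_mean` and `frequency_mean` are hypothetical, and the data are the leave counts and the x/f values used in the examples of this section.

```python
def arithmetic_mean(values):
    """Sum of the observations divided by the number of observations."""
    return sum(values) / len(values)


def frequency_mean(values, freqs):
    """Sum of f*x divided by the total frequency."""
    total_fx = sum(f * x for x, f in zip(values, freqs))
    return total_fx / sum(freqs)


# Leaves taken by 10 bank employees in the last three months.
leaves = [2, 4, 6, 1, 2, 3, 1, 2, 1, 5]
print(arithmetic_mean(leaves))

# Frequency distribution: x = 0, 1, 2, 3 with f = 2, 4, 6, 8.
print(frequency_mean([0, 1, 2, 3], [2, 4, 6, 8]))
```

Computing the 'fx' row in code rather than by hand is a simple way to catch a slip in any single product before it propagates into the total.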

6.5.2 Median

Median is the middle value of the data when all the observations are arranged in ascending or descending order of their magnitude. This definition applies directly when the data under consideration is ungrouped.

For example, the monthly sales (in thousands) of 9 salespersons of an insurance company are 10, 11, 13, 12, 17, 18, 9, 14, and 15. To obtain the median for this data, first arrange the observations in ascending order, that is, 9, 10, 11, 12, 13, 14, 15, 17, and 18. Then, the value of the observation in the $\left(\frac{n+1}{2}\right)$th position, that is, the $\left(\frac{9+1}{2}\right)$th = 5th position, gives the median, which is equal to 13.

However, suppose that in the above example there are 10 salespersons and the observations are 10, 11, 13, 12, 17, 18, 9, 14, 15, and 17. Arranged in ascending order, the observations are 9, 10, 11, 12, 13, 14, 15, 17, 17, and 18. For an even number of observations, the median is the mean of the values of the observations in the $\left(\frac{n}{2}\right)$th and $\left(\frac{n}{2} + 1\right)$th positions, that is, the 5th and 6th positions. Thus, the median is $\frac{13 + 14}{2} = 13.5$.

Again, for a discrete frequency distribution:

x       f       c.f.
x1      f1      c1
x2      f2      c2
x3      f3      c3
...     ...     ...
xn      fn      cn
Total   N

Table 6.5.2a: Frequency Distribution Table

First, you have to compute $\frac{N}{2}$. Then, the value of c.f. just greater than $\frac{N}{2}$ is found, and the value of x corresponding to that c.f. is the median. For a continuous frequency table, the calculation of the median is different from that for a discrete frequency table. Consider the following distribution of employees' bonus for a company.

x          f     c.f.
0 - 10     2     2
10 - 20    4     6
20 - 30    6     12
30 - 40    8     20
Total      20

Table 6.5.2b: Distribution of Employees' Bonus


Thus, the median is given by the formula:

$$\text{Median} = L + \frac{\frac{N}{2} - \text{c.f.}}{f} \times i$$

Where,

L = Lower limit of the median class

N = Total frequency

c.f. = Cumulative frequency of the class preceding the median class

f = Frequency of the median class

i = Class interval of the median class

Therefore, from Table 6.5.2b, $\frac{N}{2} = \frac{20}{2} = 10$; the c.f. just greater than 10 is 12, and 12 corresponds to the class 20 - 30.

Thus, the median class is 20 - 30, L = 20, c.f. = 6, f = 6, and i = 10.

$$\text{Median} = L + \frac{\frac{N}{2} - \text{c.f.}}{f} \times i = 20 + \frac{10 - 6}{6} \times 10 = 20 + \frac{4}{6} \times 10 = 20 + 6.667 = 26.667$$

Thus, the median divides the whole data series into two equal parts. In addition, it is not affected by the extreme values.
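The median calculations above can be sketched in code. The example below is an illustration under stated assumptions: `grouped_median` is a hypothetical helper that applies the grouped formula to classes given as (lower limit, upper limit) pairs, while `statistics.median` from the Python standard library handles the ungrouped case for both odd and even numbers of observations.

```python
import statistics


def grouped_median(classes, freqs):
    """Median of a continuous frequency distribution.

    classes: list of (lower_limit, upper_limit) tuples in ascending order
    freqs:   frequency of each class
    """
    half = sum(freqs) / 2          # N/2
    cumulative = 0                 # c.f. of the classes before the current one
    for (lower, upper), f in zip(classes, freqs):
        if cumulative + f > half:  # first class whose c.f. exceeds N/2
            return lower + (half - cumulative) / f * (upper - lower)
        cumulative += f
    raise ValueError("distribution has no observations")


# Ungrouped data: monthly sales (in thousands) of 9 and then 10 salespersons.
sales_9 = [10, 11, 13, 12, 17, 18, 9, 14, 15]
sales_10 = sales_9 + [17]
print(statistics.median(sales_9), statistics.median(sales_10))

# Grouped data: distribution of employees' bonus (Table 6.5.2b).
bonus_classes = [(0, 10), (10, 20), (20, 30), (30, 40)]
bonus_freqs = [2, 4, 6, 8]
print(round(grouped_median(bonus_classes, bonus_freqs), 3))
```

Sorting is done internally by `statistics.median`, which removes the most error-prone manual step, namely writing out the ordered list without dropping an observation.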

6.5.3 Mode

It measures the most frequently occurring value in the data. For ungrouped discrete data, the mode is the value of the observation that occurs the maximum number of times.

For example, suppose the ages of the employees in the marketing department are 25, 22, 35, 29, 42, 26, 36, 25, 29, 28, and 29. Here, 29 occurs the maximum number of times, that is, 3; therefore, the mode is 29.


For a grouped continuous frequency distribution, the mode is obtained by the following formula:

$$M_0 = L + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times i$$

Where,

L = Lower limit of the modal class

$f_1$ = Frequency of the modal class

$f_0$ = Frequency of the class preceding the modal class

$f_2$ = Frequency of the class succeeding the modal class

i = Class interval of the modal class

For example, consider the following frequency distribution.

x          f     c.f.
0 - 10     5     5
10 - 20    14    19
20 - 30    23    42
30 - 40    8     50
Total      50

Table 6.5.3a: Distribution of Employees' Bonus

Here, the highest frequency is 23; thus, the modal class is 20 - 30.

Also, L = 20, $f_1$ = 23, $f_0$ = 14, $f_2$ = 8, and i = 10.

$$M_0 = 20 + \frac{23 - 14}{2 \times 23 - 14 - 8} \times 10 = 20 + \frac{9}{46 - 22} \times 10 = 20 + \frac{9}{24} \times 10 = 23.75$$

The mode can also be determined graphically. In some cases, there can be two or more modes in a single data set. For such data, the interpretation of results is very difficult.
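As a cross-check of the mode calculations, here is a short Python sketch. The helper `grouped_mode` is hypothetical and assumes equal-width classes listed in ascending order with a single, clearly highest frequency; `statistics.mode` is the standard-library function for ungrouped data.

```python
import statistics


def grouped_mode(classes, freqs):
    """Mode of a continuous frequency distribution.

    classes: list of (lower_limit, upper_limit) tuples in ascending order
    freqs:   frequency of each class
    """
    k = max(range(len(freqs)), key=lambda j: freqs[j])   # index of modal class
    lower, upper = classes[k]
    f1 = freqs[k]
    f0 = freqs[k - 1] if k > 0 else 0                    # preceding class
    f2 = freqs[k + 1] if k < len(freqs) - 1 else 0       # succeeding class
    return lower + (f1 - f0) / (2 * f1 - f0 - f2) * (upper - lower)


# Ungrouped data: ages of employees in the marketing department.
ages = [25, 22, 35, 29, 42, 26, 36, 25, 29, 28, 29]
print(statistics.mode(ages))

# Grouped data: distribution of employees' bonus (Table 6.5.3a).
classes = [(0, 10), (10, 20), (20, 30), (30, 40)]
freqs = [5, 14, 23, 8]
print(grouped_mode(classes, freqs))
```

The sketch does not handle the multi-modal case: if two classes share the highest frequency, `max` simply picks the first, which is one reason such data are hard to interpret.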


6.5.4 Geometric Mean

The nth root of the product of all the observations is called the geometric mean and is abbreviated as G.M. That is, if $x_1, x_2, \dots, x_n$ is a given set of 'n' observations, then the geometric mean is:

$$G.M. = \sqrt[n]{x_1 \cdot x_2 \cdots x_n}$$

Taking the logarithm first and then the antilogarithm, the formula can be written as:

$$G.M. = \text{antilog}\left(\frac{\sum_{i} \log x_i}{n}\right)$$

For ungrouped data, suppose the scores obtained by 5 candidates in an aptitude test are 5, 6, 5.5, 8, and 5.

Therefore, $G.M. = \sqrt[5]{5 \times 6 \times 5.5 \times 8 \times 5} = \sqrt[5]{6600} = 5.806$

For a discrete frequency distribution, the formula for G.M. is:

$$G.M. = \sqrt[N]{x_1^{f_1} \cdot x_2^{f_2} \cdots x_n^{f_n}}$$

or

$$G.M. = \text{antilog}\left(\frac{\sum_{i} f_i \log x_i}{N}\right)$$

Where, $N = \sum_{i} f_i$

For a continuous frequency distribution, the mid-values ($m_i$) are calculated from the class intervals, and the formula for G.M. is given as:

$$G.M. = \text{antilog}\left(\frac{\sum_{i} f_i \log m_i}{N}\right)$$
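The logarithmic form of the geometric mean translates directly into code. The sketch below is illustrative only: `geometric_mean` is a hypothetical function name, base-10 logarithms are an arbitrary choice (any base gives the same result), and all values are assumed to be positive.

```python
import math


def geometric_mean(values, freqs=None):
    """Antilog of the (frequency-weighted) average of the logarithms.

    values: observations, or class mid-values for a continuous distribution
    freqs:  optional frequencies; each observation counts once if omitted
    """
    if freqs is None:
        freqs = [1] * len(values)
    total = sum(freqs)                                   # N
    log_sum = sum(f * math.log10(x) for x, f in zip(values, freqs))
    return 10 ** (log_sum / total)                       # antilog


# Scores obtained by 5 candidates in an aptitude test.
scores = [5, 6, 5.5, 8, 5]
print(round(geometric_mean(scores), 3))
```

Passing mid-values and frequencies to the same function covers the discrete and continuous frequency formulas, since they differ only in what plays the role of x.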


6.5.5 Harmonic Mean

Harmonic mean or H.M. is the reciprocal of the average of the reciprocals of the observations. That is, if $x_1, x_2, \dots, x_n$ is a given set of 'n' observations, then the harmonic mean is given as:

$$H.M. = \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \dots + \frac{1}{x_n}} = \frac{n}{\sum_{i} \frac{1}{x_i}}$$

The above equation is used for ungrouped discrete data.

For example, the H.M. for the observations 4, 6, and 2 is:

$$H.M. = \frac{3}{\frac{1}{4} + \frac{1}{6} + \frac{1}{2}} = 3 \times \frac{12}{11} = 3.273$$

For a discrete frequency distribution, as in Table 6.5.2a, the harmonic mean is given as:

$$H.M. = \frac{N}{\sum_{i} \left(f_i \times \frac{1}{x_i}\right)}$$

For a continuous series,

$$H.M. = \frac{N}{\sum_{i} \left(f_i \times \frac{1}{m_i}\right)}$$

where $m_i$ is the mid-value of the $i$th class, and $N = \sum_{i} f_i$.

For example, consider the following frequency distribution:

x          Mid-value m    f     f/m
0 - 10     5              2     0.4
10 - 20    15             4     0.267
20 - 30    25             1     0.04
30 - 40    35             3     0.086
Total                     10    0.793

Table 6.5.5a: Frequency Distribution Table

$$H.M. = \frac{10}{0.793} = 12.610$$
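A short sketch of the harmonic mean in Python follows. The function name `harmonic_mean` is hypothetical and all values are assumed to be non-zero. Note that the code keeps full precision for each f/m term, so its result for the grouped example can differ in the second decimal place from a hand calculation that rounds each term to three decimals first.

```python
def harmonic_mean(values, freqs=None):
    """Total frequency divided by the sum of f/x.

    values: observations, or class mid-values for a continuous series
    freqs:  optional frequencies; each observation counts once if omitted
    """
    if freqs is None:
        freqs = [1] * len(values)
    return sum(freqs) / sum(f / x for x, f in zip(values, freqs))


# Ungrouped observations 4, 6 and 2.
print(round(harmonic_mean([4, 6, 2]), 3))

# Continuous series: mid-values 5, 15, 25, 35 with frequencies 2, 4, 1, 3.
print(harmonic_mean([5, 15, 25, 35], [2, 4, 1, 3]))
```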


6.6 Measures of Dispersion

Dispersion in statistics means the variability or spread of the observations in the data. A measure of dispersion helps to judge how well an average represents the data: the more scattered the observations are, the less reliable the average is as a summary, and the less scattered they are, the more reliable it is.

The most popular measures of dispersion are:

Range

Mean deviation

Standard deviation

6.6.1 Range

Range is the simplest measure of dispersion. It is defined as the difference between the highest and the lowest value of the data series and is given as:

Range = Highest value - Lowest value

Since the range depends only on the two extreme values and is not based on all observations, its application is limited.

6.6.2 Mean Deviation

Mean deviation or M.D. is defined as the average of the absolute deviations of the values in a series from its mean and is given as:

$$M.D. = \frac{1}{n} \sum_{i=1}^{n} |x_i - \bar{X}|$$

Where,

$\bar{X}$ = Arithmetic mean of the series

$n$ = Total number of observations

In this measure, the negative sign (-) of a deviation is ignored in the calculation. For a frequency distribution, as in Table 6.5.2a, the formula is given as:

$$M.D. = \frac{1}{N} \sum_{i=1}^{n} f_i |x_i - \bar{X}|$$

Where,

$$\bar{X} = \frac{\sum_{i=1}^{n} f_i x_i}{N}$$

For example, consider the following frequency table:

x       f      fx     |x - X̄|    f|x - X̄|
1       5      5      2.9        14.5
2       10     20     1.9        19
3       15     45     0.9        13.5
4       30     120    0.1        3
5       40     200    1.1        44
Total   100    390               94

Table 6.6.2a: Table for the Calculation of M.D.

From Table 6.6.2a, $N = \sum_{i=1}^{n} f_i = 100$ and $\bar{X} = \frac{\sum_{i=1}^{n} f_i x_i}{N} = \frac{390}{100} = 3.9$

Therefore, $M.D. = \frac{1}{N} \sum_{i=1}^{n} f_i |x_i - \bar{X}| = \frac{1}{100} \times 94 = 0.94$
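The mean deviation of a frequency table can be verified with a few lines of Python. This is a minimal sketch; `mean_deviation` is a hypothetical function name, and the data are the x and f columns of Table 6.6.2a.

```python
def mean_deviation(values, freqs=None):
    """Frequency-weighted average of the absolute deviations from the mean."""
    if freqs is None:
        freqs = [1] * len(values)
    n = sum(freqs)                                       # N
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    return sum(f * abs(x - mean) for x, f in zip(values, freqs)) / n


x = [1, 2, 3, 4, 5]
f = [5, 10, 15, 30, 40]
print(round(mean_deviation(x, f), 2))
```

Using `abs` is what "ignoring the negative sign" means in practice: without it, the deviations from the mean would always sum to zero.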

6.6.3 Standard Deviation

Standard deviation is the most widely used measure of dispersion. It is defined as the positive square root of the average of the squares of the deviations of the observations from their mean and is denoted by $\sigma$. It is given as:

$$\sigma = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{X})^2}$$

For a frequency distribution, the standard deviation is given as:

$$\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{n} f_i (x_i - \bar{X})^2}$$

The value $\sigma^2$ is the variance of the data, that is, $\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{X})^2$

For example, for the given data, the standard deviation is calculated as:

Class Interval   Mid-value x   f    fx      x - X̄     (x - X̄)²   f(x - X̄)²
0 - 3            1.5           1    1.5     -3.375    11.391     11.391
3 - 6            4.5           5    22.5    -0.375    0.141      0.703
6 - 9            7.5           2    15      2.625     6.891      13.781
Total                          8    39                           25.875

Table 6.6.3a: Frequency Distribution

$$\bar{X} = \frac{\sum_{i=1}^{n} f_i x_i}{N} = \frac{39}{8} = 4.875$$

$$\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{n} f_i (x_i - \bar{X})^2} = \sqrt{\frac{1}{8} \times 25.875} = 1.798$$

Thus, the standard deviation is equal to 1.798.
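The grouped standard deviation can be sketched in code as follows. The helper `grouped_std` is hypothetical; it assumes that each class is represented by its midpoint, (lower limit + upper limit) / 2, and it divides by N (the population form used in the formulas of this section) rather than by N - 1.

```python
import math


def grouped_std(classes, freqs):
    """Standard deviation of a continuous frequency distribution.

    classes: list of (lower_limit, upper_limit) tuples
    freqs:   frequency of each class
    """
    mids = [(lower + upper) / 2 for lower, upper in classes]   # mid-values
    n = sum(freqs)                                             # N
    mean = sum(f * m for m, f in zip(mids, freqs)) / n
    variance = sum(f * (m - mean) ** 2 for m, f in zip(mids, freqs)) / n
    return math.sqrt(variance)


# Classes 0 - 3, 3 - 6 and 6 - 9 with frequencies 1, 5 and 2.
classes = [(0, 3), (3, 6), (6, 9)]
freqs = [1, 5, 2]
print(round(grouped_std(classes, freqs), 3))
```

Deriving the mid-values from the class limits inside the function avoids a common source of error, namely a mid-value column that does not match the class intervals.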

6.7 Measures of Skewness

The measure of skewness describes how asymmetric the distribution is. In a symmetric distribution, the mean, mode, and median lie on the same point, which divides the whole distribution into two equal parts. However, in an asymmetric distribution, the mean, median, and mode do not lie on the same point. Fig. 6.7a gives the shape of different curves:

[Figure: three curves labelled Negatively skewed, Normal curve (Mean = Mode = Median), and Positively skewed, each marked with the positions of its mean, median, and mode]

Fig. 6.7a: Symmetrical and Asymmetrical Distribution

The first curve is negatively skewed, where mode > median > mean. The third curve is positively skewed, where mode < median < mean. In the second curve, mode = median = mean.

Prof. Karl Pearson has defined a measure of skewness, which is called Karl Pearson's coefficient of skewness, as:

$$\text{Karl Pearson's Coefficient of Skewness} = \frac{\text{Mean} - \text{Mode}}{\text{Standard Deviation}} \qquad (1)$$

If mean >, = or < mode, then by equation (1) the distribution is positively skewed, symmetrical, or negatively skewed, respectively.

However, if the data has two or more modes, then the mode is replaced by the median and the coefficient is given as:

$$\text{Karl Pearson's Coefficient of Skewness} = \frac{3(\text{Mean} - \text{Median})}{\text{Standard Deviation}} \qquad (2)$$

If mean >, = or < median, then by equation (2) the distribution is positively skewed, symmetrical, or negatively skewed, respectively.
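Both forms of Karl Pearson's coefficient can be computed with the Python standard library. The sketch below is illustrative: `pearson_skewness` is a hypothetical function name, the sample data are invented, and `statistics.pstdev` (the population standard deviation) is assumed to be the intended denominator.

```python
import statistics


def pearson_skewness(values, use_median=False):
    """Karl Pearson's coefficient of skewness.

    use_median=False: (mean - mode) / standard deviation      -- equation (1)
    use_median=True:  3 * (mean - median) / standard deviation -- equation (2)
    """
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    if use_median:
        return 3 * (mean - statistics.median(values)) / sd
    # statistics.mode returns the first mode found if there are several
    # (Python 3.8+), so equation (2) is the safer choice for such data.
    return (mean - statistics.mode(values)) / sd


# Hypothetical data with a long right tail (positively skewed).
data = [1, 2, 2, 3, 10]
print(pearson_skewness(data) > 0, pearson_skewness(data, use_median=True) > 0)
```

The sign of the result is what matters for classification: positive for a positively skewed distribution, zero for a symmetrical one, and negative for a negatively skewed one.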

6.8 Measures of Relationships

In data with many variables, you can use different statistical measures to calculate the relationship between the variables. For bivariate data, cross tabulation, Charles Spearman's correlation coefficient, Karl Pearson's correlation coefficient, or simple regression can be used for measuring the relationship between the variables.

Cross Tabulation

In a cross table representation, the data is represented such that the variables can be compared with each other. The whole data is arranged into categories, which are further divided into two or more sub-categories. The row categories and column categories are then compared, and the cross table is filled in according to the given data.

For example, Fig. 6.8a is a cross tabulation representation.

[Figure: two cross-tabulations of survey responses (in percent) by Total Adults, Gender (Men, Women), and Age (18-29, 30-39, 40-49, 50-64, 65+). (A) Cross-tabulation of the question "Have you followed the news stories about AIG bonuses?", with response rows Very closely, Somewhat closely, Not very closely, Not at all, and Not sure. (B) Cross-tabulation of the question "Is the bailout money going to those that created the crisis?", with response rows Yes, No, and Not sure.]

Fig. 6.8a: Cross Tabulation Representation

Source: Zikmund, Babin, Carr, Griffin, Business Research Methods, 8th Edition.

Karl Pearson's Correlation Coefficient

It is the most popular statistical measure of the relationship between two variables. It can only detect the extent of the relation between the variables; it does not give any information about the cause and effect of the relationship.

Karl Pearson's correlation coefficient for two variables X and Y is given as:

$$r = \frac{\text{Cov}(X, Y)}{\sigma_x \sigma_y} \quad \text{or} \quad r = \frac{\sum_{i} (x_i - \bar{X})(y_i - \bar{Y})}{\sqrt{\sum_{i} (x_i - \bar{X})^2 \sum_{i} (y_i - \bar{Y})^2}}$$

Where,

$\bar{X}$ = Mean of X variable

$\bar{Y}$ = Mean of Y variable

$\sigma_x$ = Standard deviation of X variable

$\sigma_y$ = Standard deviation of Y variable

Cov(X, Y) = Covariance between X and Y variables

The value of r lies between -1 and 1. If r = -1, then there is a perfect negative correlation between the variables. If r = 1, then there is a perfect positive correlation between the variables. If r = 0, then there is no linear correlation between the variables.
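The second form of the formula, which uses only deviations from the two means, is straightforward to compute. The following Python sketch is illustrative; `pearson_r` is a hypothetical function name and the data are invented to show the two extreme values of r.

```python
import math


def pearson_r(xs, ys):
    """Karl Pearson's correlation coefficient from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: y rises exactly with x, then falls exactly with x.
x = [1, 2, 3, 4, 5]
print(pearson_r(x, [2, 4, 6, 8, 10]))    # perfect positive correlation
print(pearson_r(x, [10, 8, 6, 4, 2]))    # perfect negative correlation
```

The function raises a division error if either variable is constant, because the standard deviation in the denominator is then zero and r is undefined.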

Spearman's Correlation Coefficient

When the data is on an ordinal scale of measurement, Karl Pearson's correlation coefficient fails to determine the relationship between the variables. In such a case, Spearman's rank correlation coefficient is used, and the values of the variables are assigned ranks. It is then calculated by the formula:

$$r = 1 - \frac{6 \sum_{i} d_i^2}{n(n^2 - 1)}$$

Where,

$d_i$ = Difference between the ranks of the $i$th pair of observations

$n$ = Total number of pairs of observations
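A minimal Python sketch of the rank correlation formula is given below. The helper names `ranks` and `spearman_r` and the sample data are hypothetical, and the sketch assumes there are no tied values, since the simple formula above does not cover ties.

```python
def ranks(values):
    """Rank each value from 1 (smallest) to n, assuming no ties."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman_r(xs, ys):
    """Spearman's rank correlation: 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical scores given to five candidates by two interviewers.
interviewer_a = [10, 20, 30, 40, 50]
interviewer_b = [1, 3, 2, 5, 4]
print(spearman_r(interviewer_a, interviewer_b))
```

Only the ordering of the values matters here, which is why the measure suits ordinal data: replacing the raw scores with any other numbers in the same order leaves the result unchanged.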

Regression

Regression gives the linear relationship between two variables. Unlike correlation, it treats one variable as dependent on the other, so the fitted equation can be used to estimate the dependent variable from the independent variable, although the equation by itself does not prove cause and effect. In this technique, the equation representing the linear relationship between the variables is considered as:

$$Y = a + bX \qquad (1)$$

Obtaining the values of a and b in equation (1) gives the regression equation. The two normal equations to obtain a and b are:

$$\sum Y = na + b \sum X$$

$$\sum XY = a \sum X + b \sum X^2$$

The regression line is graphically represented in Fig. 6.8b.

[Figure: scatter of data points with the fitted regression line drawn through them]

Fig. 6.8b: Regression line
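The two normal equations can be solved in closed form for a and b. The sketch below shows one way to do this in Python; `fit_line` is a hypothetical function name and the data are invented so that the expected line is obvious.

```python
def fit_line(xs, ys):
    """Solve the two normal equations for a and b in Y = a + bX.

    Eliminating a from the normal equations gives
        b = (n*sum(XY) - sum(X)*sum(Y)) / (n*sum(X^2) - sum(X)^2)
        a = (sum(Y) - b*sum(X)) / n
    """
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    b = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    a = (sy - b * sx) / n
    return a, b


# Hypothetical data lying exactly on the line Y = 1 + 2X.
a, b = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(a, b)
```

Once a and b are known, the value of Y for any X is estimated as a + b * X, which is the practical use of the regression line.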


These techniques are used for only two variables. When more than two variables are to be compared, you can use multiple correlation, partial correlation, multiple regression, etc. accordingly.

6.9 Association of Attributes

To study the relationship between two attributes, you have to use association of attributes. For such association, Prof. Yule has defined a coefficient of association, which is known as Yule's coefficient of association and is denoted by $Q_{AB}$, where A and B are the two attributes to be compared. It is given as:

$$Q_{AB} = \frac{(AB)(ab) - (Ab)(aB)}{(AB)(ab) + (Ab)(aB)}$$

Where,

$Q_{AB}$ = Yule's coefficient of association between attributes A and B

(AB) = Frequency denoting that both A and B are present

(Ab) = Frequency denoting that A is present but B is absent

(aB) = Frequency denoting that A is absent but B is present

(ab) = Frequency denoting that both A and B are absent

The mentioned frequencies are shown in Table 6.9a, which is a 2 x 2 contingency table.

Attribute   A       a       Total
B           (AB)    (aB)    (B)
b           (Ab)    (ab)    (b)
Total       (A)     (a)     N

Table 6.9a: Frequency Table for Attributes A and B

After computing the value of $Q_{AB}$: if $Q_{AB}$ = +1, then there is a perfect positive association between the attributes; if $Q_{AB}$ = -1, then there is a perfect negative association between the attributes; and if $Q_{AB}$ = 0, there is no association between the attributes.
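Yule's coefficient needs only the four cell frequencies of the 2 x 2 contingency table. A minimal Python sketch follows; the function name `yules_q`, its parameter names, and the cell counts are hypothetical.

```python
def yules_q(both, a_only, b_only, neither):
    """Yule's coefficient of association for a 2 x 2 contingency table.

    both    = (AB): A and B present
    a_only  = (Ab): A present, B absent
    b_only  = (aB): A absent, B present
    neither = (ab): A and B absent
    """
    numerator = both * neither - a_only * b_only
    denominator = both * neither + a_only * b_only
    return numerator / denominator


# Hypothetical counts: (AB) = 30, (Ab) = 10, (aB) = 20, (ab) = 40.
print(round(yules_q(30, 10, 20, 40), 3))
```

The coefficient is undefined when both products are zero, in which case the function raises a division error rather than returning a misleading value.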

6.10 Time Series Analysis and Index Number

Time Series

Data that is recorded with respect to a sequence of time is called time series data.

Example: The yearly or monthly sales of a departmental store, the monthly incentives given to the employees in a sales department of a company, etc.

Thus, time series data consists of a variable, denoted as $Y_t$, recorded at specified time points t. A time series is affected by four components:

Secular variation or Trend (T)

Cyclical variation (C)

Seasonal variation (S)

Irregular or Random variation (I)

Secular variations are the variations in data when it is observed for a long period of time. Thus, the effect of a trend is almost consistent throughout the period considered.

In cyclical variations, there is an oscillatory movement in the data. A business cycle is a good example of a cyclical variation, as the cycle oscillates from prosperity to recession, then to depression and finally, recovery, as shown in Fig. 6.10a.

[Figure: the four phases of a business cycle, drawn as a wave rising through prosperity to a peak, falling to a trough, and rising to the next peak]

Fig. 6.10a: Cyclical Variation in a Business Cycle


Seasonal variations are the variations that occur in data seasonally. For example, the sales of flight tickets go up during the holiday season and stay at their normal level during the rest of the year.

Lastly, irregular fluctuations are the variations that occur in data randomly. For example, if there is a natural calamity, like a flood or an earthquake, or there is a strike or a war, then the variation in the data due to such causes is called irregular variation.

Index Number

An index number is a device which shows, by its variation, the change in a magnitude that is not capable of accurate measurement by itself or of direct valuation in practice (Wheldon, Business Statistics). Thus, an index number studies change, mainly in economic activities, over a period of time. For example, it studies the change in the prices of a commodity between two different situations (years).

Some of the most commonly used methods of constructing index numbers are the Laspeyres method, the Paasche method, and Fisher's ideal method. An index number is also called an economic barometer because it studies the change in different economic situations. It gives an approximate idea of the change but does not give an accurate result of the change.
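The three methods named above can be sketched in a few lines of Python. The formulas used here are the standard textbook definitions of the price indices, which are not given in this chapter and are therefore stated as an assumption in the code comments; the function names and the price and quantity data are hypothetical.

```python
import math


def laspeyres(p0, p1, q0):
    """Laspeyres price index: base-year quantities as weights.
    Assumed standard definition: 100 * sum(p1*q0) / sum(p0*q0)."""
    return 100 * sum(a * q for a, q in zip(p1, q0)) / sum(b * q for b, q in zip(p0, q0))


def paasche(p0, p1, q1):
    """Paasche price index: current-year quantities as weights.
    Assumed standard definition: 100 * sum(p1*q1) / sum(p0*q1)."""
    return 100 * sum(a * q for a, q in zip(p1, q1)) / sum(b * q for b, q in zip(p0, q1))


def fisher(p0, p1, q0, q1):
    """Fisher's ideal index: geometric mean of the Laspeyres and Paasche indices."""
    return math.sqrt(laspeyres(p0, p1, q0) * paasche(p0, p1, q1))


# Hypothetical prices and quantities of two commodities in two years.
base_prices, current_prices = [10, 20], [12, 25]
base_quantities, current_quantities = [5, 2], [6, 3]
print(round(laspeyres(base_prices, current_prices, base_quantities), 2))
print(round(paasche(base_prices, current_prices, current_quantities), 2))
print(round(fisher(base_prices, current_prices, base_quantities, current_quantities), 2))
```

A value above 100 indicates that prices have risen relative to the base year. Because Fisher's index is a geometric mean, it always lies between the other two.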

6.11 Chapter Summary

Tabulation is a useful representation that summarizes data for comparison and computation.

Measures of central tendency help to represent the characteristics of data by a single value. The measures of central tendency include the mean, median, mode, harmonic mean, and geometric mean.

Measures of dispersion study the variability among the observations in a data series. The main measures of dispersion are the range, mean deviation, and standard deviation.

The objective of discriminant analysis is to classify objects, by a set of independent variables, into two or more mutually exclusive and exhaustive categories.

When there are two or more independent variables, the analysis concerning the relationship is known as multiple correlation, and the equation describing such a relationship is known as a multiple regression equation.

If a variable $Y_t$ is studied at different points of time (t), the series so obtained is called a time series.

An index number measures the change in an economic condition between two situations.

7. Hypothesis Testing

7.1 Introduction

Hypothesis testing is an important tool used in obtaining research inferences. Research is carried out to reach new and advanced findings, and hypothesis testing enables the researcher to do so. It helps to draw conclusions about the population based on sample observations.

Hypothesis testing also helps in decision making in the field of business and industry. For example, the manager of a garment factory wants to compare the outputs of two of their factories in two different locations. To test the hypothesis, one initially proceeds on the assumption that both factories have the same output. After applying statistical tools for testing the hypothesis, one can conclude, on the basis of the resulting value, whether the hypothesis is true or not. Similarly, to study the average number of customers visiting a shopping centre, one can collect a sample of the number of customers visiting the shopping centre over 10 days, apply a statistical technique, and draw a conclusion about the average number of customers visiting the shopping centre.

After reading this chapter, you will be able to:

Define hypothesis

State the different types of hypothesis

Discuss some terminologies of hypothesis testing

Describe the procedure of testing of hypothesis

Discuss parametric and non-parametric tests for hypothesis testing

Discuss hypothesis testing for mean of single sample and compare two means of

two samples

Discuss hypothesis testing for variance

Discuss hypothesis testing for simple, partial, and multiple correlation coefficients


7.2 Hypothesis

A hypothesis is a statement about population parameters. It is constructed in a manner that completely reflects the research problem. Such statements are based on some assumptions, and the results obtained depend on these assumptions.

According to Robert B. Burns and Richard A. Burns, "A hypothesis is a hunch, an educated guess, a proposition that is empirically testable." They also stated that,

"It possesses three essential steps:

1. The proposal of a hypothesis or tentative assumption to account for a phenomenon or test the validity of some situation.

2. The deduction from the hypothesis that certain phenomena should be observed in given circumstances.

3. The checking of this deduction by observation and testing."

Usually, there are relational hypotheses denoting some relationship between the variables under study, hypotheses about differences between groups, and hypotheses about differences of a particular group from some standard group or measure.

The characteristics of a hypothesis are:

A hypothesis should be simple, clearly understandable, and subject-specific, in order to give a consistent finding.

A hypothesis should be capable of undergoing statistical tests, so that results are obtained scientifically.

A relational hypothesis should state the relationship between the variables.

A hypothesis should be limited in scope and must be specific.

A hypothesis should not conflict with known facts.

A hypothesis should be testable within a reasonable period of time. It should not be so complex that testing it requires an excessive amount of time.

Where a hypothesis involves facts, these should be explained properly.


7.3 Types of Hypothesis

Hypotheses are of two types: null hypothesis and alternative hypothesis.

Null Hypothesis

The null hypothesis is a statement of no difference, that is, a hypothesis stating that there is no difference between the study variables.

According to Prof. Fisher, "A null hypothesis is the hypothesis which is tested for possible rejection under the assumptions that it is true." It is denoted by $H_0$.

For example, if you want to test whether two samples from a population are similar or not, the null hypothesis can be stated as:

$H_0$: There is no difference between the two samples.

Also, if the average height of the students in a college is said to be 5.4 feet, where the height is normally distributed with mean $\mu$, then the null hypothesis is given as:

$H_0: \mu = 5.4$ feet

Alternative Hypothesis

The alternative hypothesis is a statement that is the opposite of the null hypothesis. Thus, a statement of difference is called an alternative hypothesis. It is denoted by $H_1$ or $H_A$.

For example, the alternative hypothesis for a null hypothesis of the type $H_0: \mu = 5.4$ feet is $H_1: \mu \neq 5.4$ feet, $H_1: \mu < 5.4$ feet, or $H_1: \mu > 5.4$ feet.

Here, $H_1: \mu \neq 5.4$ feet is a two-tailed alternative hypothesis, while $H_1: \mu < 5.4$ feet and $H_1: \mu > 5.4$ feet are one-tailed alternative hypotheses (left-tailed and right-tailed, respectively).


7.4 Terminologies Used in Hypothesis Testing

Population

The population is the aggregate of all the items or individuals under study. For example, the collection of all individuals in a locality is called the population of the locality. All computers in an office, collectively, are an example of a population. Usually, a population is denoted by S.

Each of the items or individuals in a population is called a population unit. Thus, if a population S contains N units, then N is termed the population size.

A population can be finite or infinite, depending on its size. For example, the population of employees at a particular company is a finite population, while a very large population, such as that of private sector employees in India, is treated as infinite for practical purposes.

Parameter

Parameters are the characteristics that define the population. Examples of population parameters are the mean, denoted by $\mu$, and the variance, denoted by $\sigma^2$.

Sample

A sample is a subset of a population. For example, if, from a population of 1000 books on fiction, academics, and short stories, only the books on academics are selected, they form a sample drawn from the population of 1000 books.

The size of the sample is the total number of items or individuals in it and is denoted by n. A sample is very useful when the population is very large and studying the whole population would involve more money, time, and effort. In such a case, it is convenient to take a sample that is representative of the population.

Statistic

Just as parameters describe the characteristics of a population, statistics describe the characteristics of a sample. Statistics such as the mean and variance are represented by different symbols to distinguish them from parameters: the sample mean is denoted by $\bar{x}$ and the sample variance by $s^2$.

Simple Hypothesis and Composite Hypothesis

A simple hypothesis is a hypothesis that completely specifies the population, while a

composite hypothesis partially specifies the population.


For example, if the weekly sales of an apparel brand for a year are normally distributed with mean $\mu$ and variance $\sigma^2$, then a simple hypothesis is given as:

$H_0: \mu = \mu_0$ and $\sigma^2 = \sigma_0^2$

And composite hypotheses are given as:

a) $H_0: \mu = \mu_0$

b) $H_0: \mu = \mu_0,\ \sigma^2 > \sigma_0^2$

c) $H_1: \mu \neq \mu_0,\ \sigma^2 \neq \sigma_0^2$, etc.

Test Statistic

After the formulation of the hypothesis, the next step is to test it, that is, to accept or reject the null hypothesis. Since it is usually not possible to obtain observations from the whole population, a sample is selected and the hypothesis is tested on the basis of this sample. Statistics of the sample enter the computation, and a test statistic is a function of these statistics.

For example, $z = \dfrac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$ is a test statistic. Here, z is a function of the sample mean ($\bar{x}$), the assumed mean ($\mu_0$), the standard deviation ($\sigma$), and the sample size n.

Thus, the test statistic is calculated from the sample and its value is used in decision

making, whether the null hypothesis is to be accepted or rejected. The choice of an

appropriate test statistic depends on the hypothesis formulated and the population

distribution.

Critical Value

A critical value for a hypothesis test is the value with which the computed test statistic is compared in order to decide whether to accept or reject the hypothesis. The critical value depends on the level of significance and on whether the test is two-sided or one-sided.

Type-I Error and Type-II Error

While testing a hypothesis, two types of errors can be committed, namely, type-I error

and type-II error. Type-I error is committed if the null hypothesis is rejected, when it is

in fact true and type-II error is committed, if the null hypothesis is accepted, when it is

in fact false.


$P(\text{type-I error}) = \alpha$ and $P(\text{type-II error}) = \beta$

These errors can be represented in a tabular form as shown in Table 7.4a.

Decision        H0 is true          H0 is false
Reject H0       Type-I error        Correct decision
Accept H0       Correct decision    Type-II error

Table 7.4a: Type-I and Type-II Error

Level of Significance

The level of significance is a value, fixed before the test, that specifies the maximum probability of committing a type-I error that the researcher is willing to tolerate. It is denoted by $\alpha$, so it is given as:

$P(\text{type-I error}) = \alpha$

If $\alpha = 0.05$, then the level of significance is 5%, that is, the probability of rejecting a true $H_0$ is 5%, while the probability of accepting a true $H_0$ is 95% (100% - 5% = 95%).

p-value

The probability value is termed the p-value. It is the probability, computed on the assumption that the null hypothesis is true, of obtaining a value of the test statistic at least as extreme as the one actually observed. If the p-value is less than the significance level decided in advance, the difference is significant. A smaller p-value indicates stronger evidence against the null hypothesis.
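As a minimal sketch of how a p-value is compared with the level of significance, the snippet below computes the p-value of a z statistic using Python's standard library. The function name `z_p_value` and the observed value 2.10 are illustrative assumptions, not part of the text.

```python
from statistics import NormalDist


def z_p_value(z, tail="two-sided"):
    """Return the p-value of an observed z statistic, assuming H0 is true."""
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    if tail == "left":
        return std_normal.cdf(z)
    if tail == "right":
        return 1.0 - std_normal.cdf(z)
    # Two-sided: probability of a value at least this extreme in either tail.
    return 2.0 * (1.0 - std_normal.cdf(abs(z)))


alpha = 0.05             # level of significance chosen in advance
p = z_p_value(2.10)      # hypothetical observed z value
significant = p < alpha  # True when the difference is significant
```

Here the two-sided p-value is roughly 0.036, which is below 0.05, so the difference would be regarded as significant at the 5% level.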

Power of the Test

The power of a hypothesis test is the probability of not committing a type-II error. Thus, the power of a hypothesis test is one minus the probability of accepting the null hypothesis when it is false, that is:

Power of the test $= 1 - P(\text{type-II error}) = 1 - \beta$

Acceptance Region and Rejection Region

The sample space of all the values of the test statistic is divided into two regions, the acceptance region and the rejection region. The acceptance region is the set of values of the test statistic for which the null hypothesis is accepted. If the calculated value of the test statistic lies in the acceptance region, the null hypothesis is accepted; if it lies in the rejection region, the null hypothesis is rejected. The rejection region is also known as the critical region. Fig. 7.4a shows the acceptance and rejection regions when the level of significance is $\alpha$.

[Figure: distribution curve for $H_1: \mu < \mu_0$, with the critical region of area $\alpha$ in the left tail and the acceptance region to its right]

Fig. 7.4a: Acceptance and Critical Region for Left Tailed Test

One-sided Test and Two-sided Test

To explain one-sided and two-sided tests, consider the following example:

Suppose the mean of a normally distributed population is $\mu$ and $\mu_0$ is a fixed value. The null and alternative hypotheses are given as: $H_0: \mu = \mu_0$ and $H_1: \mu \neq \mu_0$

Thus, the alternative hypothesis means either $\mu > \mu_0$ or $\mu < \mu_0$.

When the alternative specifies only one of these directions, for example $H_1: \mu < \mu_0$, the critical region is located on one tail of the probability distribution, as shown in Fig. 7.4a. This is called a one-tailed test.

In Fig. 7.4a, the critical region is located on the left side of the distribution, and the test is termed a left-tailed test. When the critical region is located on the right side of the distribution, it is termed a right-tailed test. Fig. 7.4b shows a right-tailed test.

[Figure: distribution curve with the critical region of area $\alpha$ in the right tail and the acceptance region to its left]

Fig. 7.4b: Acceptance and Critical Region for Right Tailed Test

For $H_1: \mu \neq \mu_0$, the critical region falls on both sides of the distribution, and the test is termed a two-sided test.

[Figure: distribution curve with a critical region of area $\alpha/2$ in each tail and the acceptance region between them]

Fig. 7.4d: Acceptance and Critical Region for Two-sided Test

7.5 Procedure of Testing of Hypothesis

The procedure for testing a hypothesis is:

1. Formulation of the hypothesis

The first step is to formulate the null hypothesis and the alternative hypothesis in a manner that reflects the purpose of the study.

2. Decide the distribution

Once the sample is collected, the sampling distribution is obtained. Also, an appropriate test statistic for the test is decided.

3. Select the level of significance

The next step is to select an appropriate level of significance ($\alpha$). Usually, a 5% or 1% level of significance is considered.

4. Computation of the value of the test statistic

The value of the test statistic is computed from the sample observations.

5. Obtain the critical value

The critical value is obtained, depending on the level of significance, according to the test statistic.


6. Comparing the calculated value with the critical value

Compare the calculated value of the test statistic with the critical value. If the calculated value lies in the acceptance region (for a two-tailed test, if its absolute value is less than the critical value), the null hypothesis is accepted; otherwise, it is rejected. Equivalently, if the calculated tail probability is equal to or smaller than $\alpha$ in the case of a one-tailed test (or $\alpha/2$ in the case of a two-tailed test), reject the null hypothesis; if the calculated probability is greater, accept the null hypothesis.
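Steps 3 to 6 can be sketched in a few lines of Python for the simplest case: a two-tailed z-test for a single mean with known population standard deviation. This is only an illustration under those assumptions; the function name `two_tailed_z_test` is hypothetical, and the numbers passed to it are those of Example 01 in Section 7.7.

```python
from math import sqrt
from statistics import NormalDist


def two_tailed_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-tailed z-test of H0: mu = mu0 against H1: mu != mu0."""
    std_normal = NormalDist()
    # Step 4: compute the test statistic from the sample.
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Step 5: critical value for a two-tailed test at level alpha.
    critical = std_normal.inv_cdf(1 - alpha / 2)
    # Step 6: reject H0 when |z| falls in the critical region.
    reject = abs(z) > critical
    # Equivalent decision through the two-sided p-value.
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    return z, critical, p_value, reject


# Data of Example 01: mean 67.47, hypothesized mean 67.39, sd 1.30, n = 400.
z, critical, p_value, reject = two_tailed_z_test(67.47, 67.39, 1.30, 400)
```

With these inputs z is about 1.231 and the critical value about 1.96, so `reject` is `False` and the null hypothesis is accepted. The two decision rules agree: the p-value exceeds $\alpha$ exactly when $|z|$ is below the critical value.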

7.6 Parametric and Non-parametric Testing

For hypothesis testing, the conclusion of accepting or rejecting a null hypothesis is based on the value of the test statistic. There are many statistical tests used for hypothesis testing. These tests are of two types: parametric and non-parametric tests.

A parametric test involves some assumptions about the population considered, whereas a non-parametric test does not involve any such assumptions. Thus, when information about the population is not available, a non-parametric test is appropriate.

Some parametric tests are the z-test, t-test, $\chi^2$ test, F-test, etc. Some non-parametric tests are the sign test, run test, Wilcoxon matched-pairs test, Kruskal-Wallis test, etc.

Parametric tests for hypothesis testing that are based on the assumption of normal distribution are:

z-test

It is based on the assumption of normality and is applicable for testing the significance of several measures, like mean, median, mode, coefficient of correlation, etc. It is used to compare the mean of a large sample with a hypothetical mean, and to test the significance of the difference between the means of two samples when the samples are large or when the variance is known. It is useful for samples with a size greater than 30.

t-test

The statistic used for the t-test was introduced by W. S. Gosset in 1908 and is termed Student's t statistic. Like the z-test, the t-test is used to test the significance of a sample mean or of the difference between two sample means, when the variance of the population from which the sample is drawn is not known. For paired (related) samples, the paired t-test is used to test the significance of the mean of the differences between the paired observations.

Chi-square ($\chi^2$) test

This test is used to compare a sample variance with a theoretical population variance.

F-test

This test is used to compare the variances of two independent samples, that is, to test the homogeneity of variance of two normal populations. It is also used in ANOVA to compare the means of more than two samples at a time.

Some of the non-parametric tests for hypothesis testing are:

Sign test

It is the simplest of all the non-parametric tests. In this test, each observation is replaced by a '+' or '-' sign according to whether it lies above or below a hypothetical value. Therefore, it is termed the sign test.
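The idea can be sketched as follows, assuming that '+' marks an observation above the hypothetical value and '-' one below it, that ties are dropped, and that under the null hypothesis each sign is equally likely (so the number of '+' signs is binomial with probability 0.5). The function name `sign_test` and the data are hypothetical.

```python
from math import comb


def sign_test(observations, hypothesized_value):
    """Exact two-sided sign test against a hypothesized central value."""
    plus = sum(1 for x in observations if x > hypothesized_value)
    minus = sum(1 for x in observations if x < hypothesized_value)
    n = plus + minus       # observations equal to the value are dropped
    k = min(plus, minus)   # the rarer sign
    # P(at most k of the rarer sign) under Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    p_value = min(1.0, 2 * tail)  # two-sided p-value
    return plus, minus, p_value


scores = [12, 15, 11, 18, 14, 16, 13, 17, 19, 10]  # hypothetical data
plus, minus, p_value = sign_test(scores, 12.5)
```

For this data there are 7 plus signs and 3 minus signs, and the p-value is well above 0.05, so the hypothetical value would not be rejected at the 5% level.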

Run test

The word 'run' here denotes a sequence of identical symbols that is followed or preceded by a different symbol or by no symbol at all. A run test is used to verify whether the observations in a given data set occur in random order.
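A short sketch of counting runs is given below. The expected number of runs and its variance use the usual large-sample (Wald-Wolfowitz) formulas, which are an assumption added here and are not stated in the text; the function name `runs_test` and the sequence are hypothetical.

```python
from itertools import groupby
from math import sqrt


def runs_test(symbols):
    """Count runs in a two-symbol sequence and return a large-sample z value."""
    # Each group of consecutive identical symbols is one run.
    runs = sum(1 for _ in groupby(symbols))
    kinds = sorted(set(symbols))
    if len(kinds) != 2:
        raise ValueError("expected exactly two kinds of symbol")
    n1 = sum(1 for s in symbols if s == kinds[0])
    n2 = len(symbols) - n1
    # Mean and variance of the number of runs if the order is random.
    expected = 2 * n1 * n2 / (n1 + n2) + 1
    variance = (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)) / (
        (n1 + n2) ** 2 * (n1 + n2 - 1)
    )
    z = (runs - expected) / sqrt(variance)
    return runs, expected, z


runs, expected, z = runs_test("HHTTTHTHH")  # hypothetical sequence
```

The sequence above has five runs (HH, TTT, H, T, HH). A value of z far from zero would suggest that the order is not random; for a sequence this short the normal approximation is only indicative.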

Wilcoxon matched-pairs test

When you have data for two samples which are paired, the Wilcoxon matched-pairs test is applicable. This test is used to make inferences about the difference between two populations.

Kruskal-Wallis test

This test is used to compare more than two populations for continuous data. In this test, all the observations of the groups under consideration are ranked together and then one-way ANOVA is applied, with the ranks as the values of the observations for each group.


7.7 Testing of Hypothesis for Mean

Hypothesis testing for the mean of a single sample and for the difference between two means is described below.

Testing for mean of single sample

Case 01:

When the population is infinite and normally distributed and the variance, $\sigma_p^2$, of the population is known. The sample size is denoted by n.

Here, $H_0: \mu = \mu_0$

Where,

$\mu_0$ = A hypothetical value

Then, for a one-sided or two-sided alternative hypothesis, the test statistic applied is given as:

$z = \dfrac{\bar{X} - \mu_0}{\sigma_p / \sqrt{n}}$

And z follows the standard normal distribution with mean 0 and variance 1.

Case 02:

When the population is finite (of size N) and normally distributed and the variance ($\sigma_p^2$) of the population is known, with null hypothesis $H_0: \mu = \mu_0$ and a one-sided or two-sided alternative hypothesis.

Then, the test statistic z is given as:

$z = \dfrac{\bar{X} - \mu_0}{(\sigma_p / \sqrt{n}) \sqrt{(N - n)/(N - 1)}}$

Case 03:

When the population is infinite and normally distributed but the variance ($\sigma_p^2$) of the population is unknown.

Since the population variance is not known, the sample standard deviation is used as an estimate of the population standard deviation:

$\sigma_s = \sqrt{\dfrac{\sum (x_i - \bar{X})^2}{n - 1}}$

And the test statistic used to test the hypothesis is

$t = \dfrac{\bar{X} - \mu_0}{\sigma_s / \sqrt{n}}$

with (n - 1) degrees of freedom (df).


In case the population is finite, the test statistic is given as:

$t = \dfrac{\bar{X} - \mu_0}{(\sigma_s / \sqrt{n}) \sqrt{(N - n)/(N - 1)}}$
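The t statistic for a single mean, with and without the finite population correction, can be sketched as below. The function name `t_statistic`, the sample values, and the population size of 21 are hypothetical and serve only to illustrate the formulas.

```python
from math import sqrt
from statistics import mean, stdev


def t_statistic(sample, mu0, population_size=None):
    """Return (t, df) for H0: mu = mu0 when the population variance is unknown."""
    n = len(sample)
    x_bar = mean(sample)
    s = stdev(sample)        # sample standard deviation, divisor n - 1
    std_error = s / sqrt(n)
    if population_size is not None:
        # Finite population correction: sqrt((N - n) / (N - 1)).
        std_error *= sqrt((population_size - n) / (population_size - 1))
    return (x_bar - mu0) / std_error, n - 1


sample = [10, 12, 9, 11, 13]                         # hypothetical observations
t_infinite, df = t_statistic(sample, mu0=10)
t_finite, _ = t_statistic(sample, mu0=10, population_size=21)
```

Because the correction factor is below 1, the standard error shrinks and the t value for the finite population is larger in magnitude than the one for the infinite population. Either value is then compared with the tabulated t for `df` degrees of freedom.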

Example 01:

A sample of 400 male students is found to have a mean height of 67.47 inches. Can it be reasonably regarded as a sample from a large population with mean height 67.39 inches and standard deviation 1.30 inches? Test at the 5% level of significance.

Solution 01:

Taking the null hypothesis that the mean height of the population is equal to 67.39 inches, we can write:

$H_0: \mu = 67.39$ inches

$H_1: \mu \neq 67.39$ inches

The given information is $\bar{X} = 67.47$ inches, $\sigma_p = 1.30$ inches, n = 400. Assuming the population to be normal, we can work out the test statistic z as follows:

$z = \dfrac{\bar{X} - \mu_0}{\sigma_p / \sqrt{n}} = \dfrac{67.47 - 67.39}{1.30 / \sqrt{400}} = \dfrac{0.08}{0.065} = 1.231$

As $H_1$ is two-sided, we apply a two-tailed test to determine the rejection region at the 5% level of significance which, using the normal curve area table, is:

$R: |z| > 1.96$

The observed value of z is 1.231, which lies in the acceptance region since it does not exceed 1.96, and thus $H_0$ is accepted. We may conclude that the given sample (with mean height 67.47 inches) can be regarded as taken from a population with mean height 67.39 inches and standard deviation 1.30 inches at the 5% level of significance.

Source: Kothari, C. R., Research Methodology: Methods and Techniques, New Age International Publishers, New Delhi, 2nd Edition


Testing for difference between means

To test the null hypothesis in different cases, the test statistics are:

Case 04:

When the population variances ($\sigma_{p1}^2$, $\sigma_{p2}^2$) are known and the samples are large. Then, to test the hypothesis:

$H_0: \mu_1 = \mu_2$

Where,

$\mu_1$, $\mu_2$ = Population means of the two separate populations from which the samples are drawn.

Then the test statistic z is given as:

$z = \dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{\sigma_{p1}^2}{n_1} + \dfrac{\sigma_{p2}^2}{n_2}}}$

Where,

$\bar{X}_1$ and $\bar{X}_2$ = Means of the samples of size $n_1$ and $n_2$

If the population variances $\sigma_{p1}^2$, $\sigma_{p2}^2$ are not known, the sample variances are used:

$\sigma_{s1}^2 = \dfrac{\sum (x_{1i} - \bar{X}_1)^2}{n_1 - 1}$, $\sigma_{s2}^2 = \dfrac{\sum (x_{2i} - \bar{X}_2)^2}{n_2 - 1}$

Case 05:

If large samples are drawn from the same population with known variance $\sigma_p^2$, then the test statistic z is given as:

$z = \dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{\sigma_p^2 \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$

However, in case the population variance is not known, the combined sample standard deviation ($\sigma_{s1.2}$) is used in place of $\sigma_p$ and is given as:

$\sigma_{s1.2} = \sqrt{\dfrac{n_1(\sigma_{s1}^2 + d_1^2) + n_2(\sigma_{s2}^2 + d_2^2)}{n_1 + n_2}}$

Where,

$d_1 = (\bar{X}_1 - \bar{X}_{1.2})$

$d_2 = (\bar{X}_2 - \bar{X}_{1.2})$

$\bar{X}_{1.2} = \dfrac{n_1 \bar{X}_1 + n_2 \bar{X}_2}{n_1 + n_2}$

Case 06:

For small samples, if the population variances are unknown but assumed to be equal, the test statistic is given as:

$t = \dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{\sum (x_{1i} - \bar{X}_1)^2 + \sum (x_{2i} - \bar{X}_2)^2}{n_1 + n_2 - 2} \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$

with df = ($n_1 + n_2 - 2$)

Example 02:

The mean scores, in a test with a total of 100 marks, of two samples of 80 and 100 students are 61 and 55, respectively. The standard deviations of the two samples are given as 2 and 1, respectively. Test whether the two samples are drawn from the same population with standard deviation 1.4. Use level of significance = 0.05.

Solution 02:

The null and the alternative hypotheses are stated as:

$H_0: \mu_1 = \mu_2$

$H_1: \mu_1 \neq \mu_2$

From the given data we have $\bar{X}_1 = 61$ and $\bar{X}_2 = 55$, samples of size $n_1 = 80$ and $n_2 = 100$, and $\sigma_p = 1.4$.

Here, the test statistic to be used is the z statistic, given as:

$z = \dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{\sigma_p^2 \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \dfrac{61 - 55}{\sqrt{(1.4)^2 \left(\dfrac{1}{80} + \dfrac{1}{100}\right)}} = \dfrac{6}{\sqrt{(1.96)(0.0225)}} = \dfrac{6}{0.21} = 28.571$

Using the normal curve area table, the critical region for the 5% level of significance is $|z| > 1.96$.

Therefore, the calculated value of z falls in the rejection region. Thus, the difference between the sample means is statistically significant and the null hypothesis is rejected at the 5% level of significance. We may conclude that the two samples are not drawn from the same population.

7.8 Testing of Hypothesis for Variance

When the hypothesis is tested for variance there can be two cases:

Variance of a single sample

Equality of variances of two normal populations

Variance of a single sample

When a sample with variance $\sigma_s^2$ is drawn from a population with variance $\sigma_p^2$, the null hypothesis is given as: $H_0: \sigma_s^2 = \sigma_p^2$

The test statistic is:

$\chi^2 = \dfrac{\sigma_s^2 (n - 1)}{\sigma_p^2}$

which follows the chi-square distribution with (n - 1) df.

Equality of variances of two normal populations

When two populations are to be compared for equality of variances, the null hypothesis is given as: $H_0: \sigma_{p1}^2 = \sigma_{p2}^2$

And the test statistic is given as:

$F = \dfrac{\sigma_{s1}^2}{\sigma_{s2}^2}$

Where,

$\sigma_{s1}^2 = \dfrac{\sum (x_{1i} - \bar{X}_1)^2}{n_1 - 1}$

$\sigma_{s2}^2 = \dfrac{\sum (x_{2i} - \bar{X}_2)^2}{n_2 - 1}$

If the calculated value of F is greater than the table value of F, at a certain level of significance, for ($n_1 - 1$) and ($n_2 - 1$) degrees of freedom, regard the F-ratio as significant.


Example 03:

Two samples are drawn from two normally distributed populations as given below. Test whether the two populations have the same variance at the 5% level of significance.

Sample 1:  4  6  3  8  10  11  6  7  2  12  17  16

Sample 2:  12  14  18  19  15  10  11  16  21  20  18  9  13  17

Table 7.5a: Distribution of the Two Samples

Solution 03:

Here, the null hypothesis is given as: $H_0: \sigma_{p1}^2 = \sigma_{p2}^2$

Since the population variances are not known, we use the sample variances $\sigma_{s1}^2$ and $\sigma_{s2}^2$. The test statistic to be used is the F statistic, for which the following computations are needed:

Sample 1 (x1i)   Sample 2 (x2i)   (x1i - X1)^2   (x2i - X2)^2
      4               12             20.25          10.3298
      6               14              6.25           1.4738
      3               18             30.25           7.7618
      8               19              0.25          14.3338
     10               15              2.25           0.0458
     11               10              6.25          27.1858
      6               11              6.25          17.7578
      7               16              2.25           0.6178
      2               21             42.25          33.4778
     12               20             12.25          22.9058
     17               18             72.25           7.7618
     16                9             56.25          38.6138
      -               13               -             4.9018
      -               17               -             3.1898
Total 102            213            257.00         190.3571

Table 7.5b: Distribution of the Two Samples

The sample means and variances for the two samples are:

$\bar{X}_1 = \dfrac{\sum x_{1i}}{n_1} = \dfrac{102}{12} = 8.5$

$\bar{X}_2 = \dfrac{\sum x_{2i}}{n_2} = \dfrac{213}{14} = 15.214$

$\sigma_{s1}^2 = \dfrac{\sum (x_{1i} - \bar{X}_1)^2}{n_1 - 1} = \dfrac{257}{12 - 1} = 23.364$

$\sigma_{s2}^2 = \dfrac{\sum (x_{2i} - \bar{X}_2)^2}{n_2 - 1} = \dfrac{190.357}{14 - 1} = 14.643$

The value of the test statistic is:

$F = \dfrac{\sigma_{s1}^2}{\sigma_{s2}^2} = \dfrac{23.364}{14.643} = 1.596$

The tabulated value of the F statistic at the 5% level of significance for (11, 13) df is approximately 2.63. The calculated value (1.596) is less than the tabulated value, so we accept the null hypothesis at the 5% level of significance. Hence, we can conclude that the two samples have been drawn from two populations with the same variance.
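As a check on the arithmetic, the sketch below recomputes the sample variances and the F-ratio directly from the raw data of Table 7.5a. It assumes the last two values of Sample 2 are 13 and 17, as listed in Table 7.5b; the function name `f_ratio` is hypothetical.

```python
from statistics import variance

sample_1 = [4, 6, 3, 8, 10, 11, 6, 7, 2, 12, 17, 16]
sample_2 = [12, 14, 18, 19, 15, 10, 11, 16, 21, 20, 18, 9, 13, 17]


def f_ratio(first, second):
    """Return both sample variances (divisor n - 1) and their ratio."""
    var_first = variance(first)
    var_second = variance(second)
    return var_first, var_second, var_first / var_second


var_1, var_2, f_value = f_ratio(sample_1, sample_2)
degrees_of_freedom = (len(sample_1) - 1, len(sample_2) - 1)
```

Computed this way, the variances are about 23.36 and 14.64 and the F-ratio is about 1.60, with (11, 13) degrees of freedom. The ratio is then compared with the tabulated F value at the chosen level of significance.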

7.9 Testing of Hypothesis for Correlation Coefficients

If a sample of n pairs of observations (x, y) is drawn from a normal population, r is the sample correlation coefficient between X and Y, and the population correlation is $\rho$, the null hypothesis is: $H_0: \rho = 0$

Testing of significance of simple correlation coefficient

In case of simple correlation coefficient, the test statistic is:

$t = r \sqrt{\dfrac{n - 2}{1 - r^2}}$ with df (n - 2)

This calculated value is then compared with its tabulated value for a specific level of significance. If the calculated value is less than the tabulated value, the null hypothesis is accepted; otherwise, it is rejected.


Testing of significance of partial correlation coefficient

In case of partial correlation coefficient ($r_p$), the test statistic is:

$t = r_p \sqrt{\dfrac{n - k}{1 - r_p^2}}$

Where,

n = Number of paired observations

k = Number of variables

If the tabulated value of t, for (n - k) df, is greater than the calculated value, we accept, at the specified level of significance, the null hypothesis that there is no partial correlation.

Testing of significance of multiple correlation coefficient

If the multiple correlation coefficient is denoted by R, then the test statistic applicable here is the F statistic, given as:

$F = \dfrac{R^2 / (k - 1)}{(1 - R^2) / (n - k)}$

Where,

k = Number of variables involved

n = Number of paired observations

If the tabulated value of F for (k - 1, n - k) df at the $\alpha$% level of significance is less than the calculated value of F, the null hypothesis is rejected at the $\alpha$% level of significance.
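The three statistics of this section can be sketched as small functions, assuming their usual forms: $t = r\sqrt{(n-2)/(1-r^2)}$ for a simple correlation, $t = r_p\sqrt{(n-k)/(1-r_p^2)}$ for a partial correlation, and $F = [R^2/(k-1)] / [(1-R^2)/(n-k)]$ for a multiple correlation. The function names and the input values are hypothetical.

```python
from math import sqrt


def t_simple_correlation(r, n):
    """t statistic and df for H0: rho = 0, simple correlation."""
    return r * sqrt((n - 2) / (1 - r ** 2)), n - 2


def t_partial_correlation(r_p, n, k):
    """t statistic and df for a partial correlation with k variables."""
    return r_p * sqrt((n - k) / (1 - r_p ** 2)), n - k


def f_multiple_correlation(big_r, n, k):
    """F statistic and (df1, df2) for a multiple correlation coefficient."""
    f = (big_r ** 2 / (k - 1)) / ((1 - big_r ** 2) / (n - k))
    return f, (k - 1, n - k)


t_simple, df_simple = t_simple_correlation(0.6, 27)        # hypothetical r, n
t_partial, df_partial = t_partial_correlation(0.5, 28, 3)  # hypothetical r_p, n, k
f_multiple, df_multiple = f_multiple_correlation(0.8, 23, 3)
```

Each returned statistic is compared with the tabulated t or F value for the returned degrees of freedom at the chosen level of significance.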

7.10 Limitations of Testing of Hypothesis

Hypothesis testing is only a technique to help in decision-making.

It only indicates whether the null hypothesis should be accepted or rejected; it does not explain why. Thus, it fails to give the cause of the acceptance or rejection.

The result obtained from the computation of a test statistic is compared with critical values, which are based on probabilities. The conclusion is therefore never certain.

Significance tests cannot all be considered accurate measures with respect to the formulated hypothesis.


7.11 Chapter Summary

A hypothesis is usually considered an essential tool in research. Its main function is to suggest new experiments and observations.

A statement made about a population parameter is known as a hypothesis.

A hypothesis stating that there is no difference between two situations, groups, or outcomes, or in the prevalence of a condition or phenomenon, is called a null hypothesis.

A hypothesis that is the opposite of the null hypothesis is called an alternative hypothesis. It is also known as the hypothesis of difference.

Parametric test procedures assume that the data come from a particular type of probability distribution.

Non-parametric tests are often used in place of their parametric counterparts when certain assumptions about the underlying population are questionable.

08. Analysis of Variance (ANOVA)

8.1 Introduction

As discussed in the previous chapter, the t-test is used to compare the means of two samples to see if there is a significant difference between them. However, if an experiment involves more than two sets of data, comparing them pair by pair would be time consuming. In agricultural applications, for example, you must test more than two samples to study the influence of various factors, such as variation in seed quality, the effect of fertilizers on the types of seeds, etc. In such a situation, analysis of variance is applicable.

Analysis of variance, most commonly known as ANOVA, is one of the main statistical techniques used to test differences between two or more means. It is called 'Analysis of Variance' rather than 'Analysis of Means' because inferences about means are made by analyzing the variance. With ANOVA, you can analyze data from several independent variables simultaneously.

Analysis of variance for an experiment with only one factor is called 'one-way ANOVA', and for an experiment with two factors it is called 'two-way ANOVA'. In this method, the effect of a factor is tested by calculating the F-ratio; a separate F-ratio is computed for each factor in the experiment, which is a test for main effects. The ANOVA method does not depend on the number of levels of each factor. ANOVA is available for both parametric (score) data and non-parametric (ranking/ordering) data.

After reading this chapter, you will be able to:

State the meaning of Analysis of Variance (ANOVA)

Describe variability measure by one-way ANOVA

Describe the method of one-way ANOVA technique

Describe the method of two-way ANOVA technique

Describe the method of Analysis of Covariance (ANOCOVA) technique


8.2 Analysis of Variance (ANOVA)

According to Prof. R. A. Fisher, "Analysis of variance (ANOVA) is the separation of variance ascribable to one group of causes from the variance ascribable to another group." By this technique, the total variation in the sample data is expressed as the sum of its non-negative components, where each component is a measure of the variation due to some specific independent source, factor, or cause.

Analysis of variance (ANOVA) involves investigating the effects of one treatment (independent) variable on an interval-scaled dependent variable. ANOVA is the hypothesis testing technique used to check whether the differences in means between two or more groups are statistically significant, which is practically difficult to do with the z-test or t-test.

Using the ANOVA technique, you can decompose the total variability found within a data set into two components: random factors and systematic factors. The random factors do not have any statistical influence on the given data set, while the systematic factors do. The ANOVA test is also used to determine the impact of independent variables on the dependent variable in a regression analysis.

Examples:

Suppose you have data on students' performance in non-assessed assignments as well
as their final grades, and you want to find out whether performance in the
assignments is related to the final grade obtained. Using ANOVA, you can split the
group according to grade and see whether assignment performance differs across
the grades.

Consider that a manager wants to find whether the location has an effect on the
profit of an apparel retail business having the following alternatives for location,
namely, stand-alone shop, shop in a shopping centre, and online delivery system.
Here, location is the only independent variable (IV) and the profit/loss is the
dependent variable (DV). Because three groups have to be compared, the t-test would
not be appropriate. One-way ANOVA would be the choice for analysis.


Assumptions in ANOVA

Ordinarily, the categories of the independent variable are assumed to be fixed.

This type of model is known as fixed effect model.

The error terms or variation within samples are normally distributed, with a constant
variance and zero mean. The error is not related to any of the levels of X.

The error terms or variation within samples are uncorrelated. If the error terms
are correlated (that is, the observations are not independent), then the F-ratio can be
seriously distorted.

ANOVA must have a Dependent Variable (DV) that is metric and one or more
Independent Variables (IV) that are all categorical. Measurable variables, such as
height, income, and age, are called metric variables.

According to Gudmund R. Iversen, Mary Gergen, and Mary M. Gergen, "a metric
variable is not metric in the sense of the metric system but in the sense that
its values can be numerically measured."

In the ANOVA technique, factors are the categorical independent variables. A treatment
is a particular combination of factor levels or categories. In the previous example of
the retail business, profit/loss is the DV, location is the factor, and the three
locations are its treatments.

Relationship Among Techniques

One-way ANOVA involves only one categorical variable or a single factor. There will be
various levels for the single factor. If two or more factors are involved, the analysis
is termed n-way ANOVA.

Professor Snedecor and many others contributed to the development of ANOVA

technique. ANOVA is, essentially, a procedure for testing the difference among different

groups of data for homogeneity. According to Prof. Snedecor, "The essence of

ANOVA is that the total amount of variation in a set of data is broken down into

two types, that amount which can be attributed to chance and that amount

which can be attributed to specified causes."

There may be variation within sample items and between samples. Using ANOVA, you
can split the variance for analytical purposes. Therefore, it is a method of analyzing
the variance to which a response is subject into its various components, corresponding
to various sources of variation. Using the ANOVA technique, you can explain whether
varieties of seeds or fertilizers or soils differ significantly. Similarly, differences
in various types of feed prepared for a particular class of animal, or various types of
drugs manufactured for curing a specific disease, may be studied and judged to be
significant or not through the application of the ANOVA technique.

8.3 Why Analyze Variance?

Consider two different experiments, with their distribution pattern as below.

[Figure: distribution plots of three groups for Experiment 1 and Experiment 2]

Experiment 1: Within-group variability for each group is relatively small. It is easy
to see that there is a difference between the means of the three groups.

Experiment 2: Three groups have approximately the same mean, unlike experiment 1, and
the variability within each group is much larger. It is not easy to see the difference
between the means of the three groups.

Hence, to differentiate the groups in experiment 2, the variability between the groups
must be greater than the variability within the groups. If the variability within the
groups is large compared to the variability between the groups, any difference between
the groups is difficult to detect. Variability between the groups and variability
within the groups are compared to determine whether or not the group means are
significantly different.
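The comparison of between-group and within-group variability can be sketched in a few lines of Python. The numbers below are hypothetical and are not the data behind the plots above; both data sets are built with the same group means (10, 15, 20) so that only the spread inside each group changes. The helper name `between_within_mean_squares` is an assumption made for this illustration.

```python
def mean(values):
    return sum(values) / len(values)


def between_within_mean_squares(groups):
    """Return (mean square between groups, mean square within groups)."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Variability of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variability of the observations around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return ss_between / df_between, ss_within / df_within


# Hypothetical data: identical group means, different spread within groups.
small_spread = [[9, 10, 11], [14, 15, 16], [19, 20, 21]]
large_spread = [[0, 10, 20], [5, 15, 25], [10, 20, 30]]

for name, groups in [("small spread", small_spread), ("large spread", large_spread)]:
    ms_between, ms_within = between_within_mean_squares(groups)
    print(name, "between:", ms_between, "within:", ms_within,
          "ratio:", ms_between / ms_within)
```

With the small spread, the between-group variability dominates and the ratio is large; with the large spread, the same differences in means are swamped by the within-group variability and the ratio falls below 1.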

Using the ANOVA technique, one can investigate any number of factors that influence the
dependent variable. Also, one may investigate the differences amongst various
categories within each of these factors, which may have a large number of possible
values.

If you take only one factor and investigate the differences amongst its various
categories, which may have numerous possible values, you are using one-way ANOVA; if
you investigate two factors at the same time, you are using two-way ANOVA. In a two-way
or higher-order ANOVA, the interaction between two independent variables (that is,
their inter-relation in affecting the dependent variable), if any, can also be studied
for better decisions.

8.4 Variability Measure by One-way ANOVA

Differences among the means of the population are tested by analyzing the amount of
variability within the group, relative to the amount of variation between the groups.
In ANOVA, the variability is decomposed into two, as shown in Fig. 8.4a.

[Figure: total variability in the DV, split into variability within the groups and variability between the groups]

Fig. 8.4a: Decomposition of Total Variability

Variability within the groups

In terms of variation within the given population, it is assumed that the value of the
i-th observation of the j-th group, $Y_{ij}$ (where i and j are positive integers),
differs from the mean of this population only because of random effects, that is, there
are influences on $Y_{ij}$ that are unexplainable.

It represents an estimate of the population variance based on the within-sample
variance, which is unexplainable and is thus treated as error in the observations.


Variability between the groups

In terms of variation (differences) between populations, assume that the difference
between the mean of the j-th group and the grand mean is attributable to what is called
a 'specific factor' or what is technically described as the treatment effect.

Two estimates are compared with the F-test for the given degrees of freedom and level

of significance. The F statistic in the ANOVA is:

$$F = \frac{\text{Estimate of population variance based on between sample variance}}{\text{Estimate of population variance based on within sample variance}}$$

In ANOVA, the F-test is used to test the null hypothesis, which for k groups is stated as:

$$H_0: \mu_1 = \mu_2 = \mu_3 = \dots = \mu_k$$

The larger the value of the F statistic, the greater the likelihood that the differences
between the means are due to the treatment or something other than chance alone, that
is, the means are significantly different from each other.

In that case, you accept the alternative hypothesis, which states that at least one of
the population means is significantly different from the rest of the means:

$$H_1: \mu_i \neq \mu_j \text{ for at least one pair } (i, j)$$


8.5 ANOVA Technique

The steps for ANOVA technique are displayed in Fig. 8.5a.

1. Identify the dependent and independent variables

2. Decompose the total variation

3. Measure the effects

4. Test the significance

5. Interpret the results

Fig. 8.5a: The Steps for ANOVA Technique

Layout for one-way ANOVA

Consider that N observations $x_{ij}$ ($i = 1, 2, \dots, k$; $j = 1, 2, \dots, n_i$) of a
random variable X are grouped into k classes of sizes $n_1, n_2, \dots, n_k$,
respectively ($N = \sum_{i=1}^{k} n_i$), as shown in Table 8.5a.

Class   Observations                   Means    Total
1       x_11   x_12   ...   x_1n_1     x̄_1.     T_1
2       x_21   x_22   ...   x_2n_2     x̄_2.     T_2
...     ...    ...    ...   ...        ...      ...
i       x_i1   x_i2   ...   x_in_i     x̄_i.     T_i
...     ...    ...    ...   ...        ...      ...
k       x_k1   x_k2   ...   x_kn_k     x̄_k.     T_k

Table 8.5a: Layout for One-way ANOVA


8.6 One-way ANOVA - Example

Example 01:

Three machines A, B, and C are tested to see whether their outputs (number of items
produced) are equivalent. The following observations of output are made:

Machine A    12   10   11   13   14   15

Machine B    11    8   12   10   13

Machine C    10   11   14   15   12   13

Table 8.6a: Output of Three Different Machines

Carry out the analysis of variance and state your conclusions.

Solution 01:

Step 1: Identifying IV and DV

For the given example,

Independent Variable X (IV): Type of machine. This is the factor, and the three
machines A, B, and C are its treatments.

Dependent Variable Y (DV): Number of items produced

Step 2: Decompose the Total Variation

The total variation in Y, denoted by $SS_y$ (SS: Sum of Squares), is decomposed into two
components, $SS_{between}$ and $SS_{within}$.

Where,

Total Variability

$SS_y$ = Total of (Observed value - Grand mean)$^2$

$$SS_y = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (Y_{ij} - \bar{Y})^2$$

Also, here R.S.S. (Raw sum of squares) = $\sum\sum Y_{ij}^2$ and c.f. is the correction
factor, $\left(\sum\sum Y_{ij}\right)^2 / N$, so that

$SS_y$ = Total S.S. = R.S.S. - c.f.

Within Group

$SS_{within}$ = Total of (Group observed value - Group mean)$^2$

Between Group

$SS_{between}$ = Total of [$n_{group}$ x (Group mean - Grand mean)$^2$]

$$\text{Between sum of squares} = \sum_{j=1}^{c} \frac{Y_{.j}^2}{n_j} - \text{c.f.}$$

where $Y_{.j} = \sum_{i=1}^{n_j} Y_{ij}$ is the total of group j.

Within S.S. = Total S.S. - Between S.S.

$$\bar{Y}_j = \frac{\sum_{i=1}^{n_j} Y_{ij}}{n_j} \quad \text{(Mean for category/group j)}$$

$$\bar{Y} = \frac{\sum_{j=1}^{c} \sum_{i=1}^{n_j} Y_{ij}}{N} \quad \text{(Mean over the whole sample or the grand mean)}$$

Here, we set up the null hypothesis as:

$H_0$: Various machines are homogeneous

That is, $\mu_1 = \mu_2 = \mu_3$

Where, $\mu_i$ = Mean output from the i-th machine (i = 1, 2, 3)

For the given example:

Machines   Output                       Total (Y_.j)   Sum of squares (Σ Y_ij²)
A          12   10   11   13   14   15  75             955
B          11    8   12   10   13       54             598
C          10   11   14   15   12   13  75             955
Total                                   204            2508

Table 8.6b: Output of Three Different Machines


R.S.S. (Raw Sum of Squares) = $\sum\sum Y_{ij}^2 = 2508$, c.f. = $\frac{\left(\sum\sum Y_{ij}\right)^2}{N} = \frac{204^2}{17} = 2448$

$SS_y$ = Total S.S. = R.S.S. - c.f. = 2508 - 2448 = 60

Between sum of squares = $\sum_{j} \frac{Y_{.j}^2}{n_j} - \text{c.f.} = \left(\frac{75^2}{6} + \frac{54^2}{5} + \frac{75^2}{6}\right) - 2448 = 10.2$

Within S.S. = Total S.S. - Between S.S.

= 60 - 10.2

= 49.8
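The arithmetic of Step 2 can be checked with a short Python sketch that follows the same shortcut (raw sum of squares minus correction factor). The data are the machine outputs from Table 8.6a; the variable names are chosen here for illustration only.

```python
# Output of the three machines, copied from Table 8.6a.
machines = {
    "A": [12, 10, 11, 13, 14, 15],
    "B": [11, 8, 12, 10, 13],
    "C": [10, 11, 14, 15, 12, 13],
}

n_total = sum(len(obs) for obs in machines.values())           # N
grand_total = sum(sum(obs) for obs in machines.values())       # sum of all Y
raw_ss = sum(y * y for obs in machines.values() for y in obs)  # R.S.S.
correction_factor = grand_total ** 2 / n_total                 # c.f.

# Total S.S. = R.S.S. - c.f.
ss_total = raw_ss - correction_factor
# Between S.S. = sum over groups of (group total)^2 / (group size) - c.f.
ss_between = sum(sum(obs) ** 2 / len(obs) for obs in machines.values()) - correction_factor
# Within S.S. = Total S.S. - Between S.S.
ss_within = ss_total - ss_between

print(round(ss_total, 4), round(ss_between, 4), round(ss_within, 4))
```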

Step 3: Measure the Effect

The strength of the effects of X (IV) on Y (DV) is measured as follows:

$$\eta^2 = \frac{SS_{between}}{SS_y} = \frac{SS_y - SS_{within}}{SS_y}$$

The value of $\eta^2$ varies between 0 and 1.

For the given example,

$$\eta^2 = \frac{SS_{between}}{SS_y} = \frac{10.2}{60} = 0.17$$

In other words, 17% of the variation in the output is accounted for by the type of
machine.

Step 4: Test the Significance

In one-way ANOVA, the interest lies in testing the null hypothesis that the category
means are equal in the population.

Under the null hypothesis:

$H_0: \mu_1 = \mu_2 = \mu_3$

Assume that the variation between the samples and within the samples come from the
same source of variation.

The null hypothesis may be tested by the F statistic based on the ratio between the two

estimates as follows:


$$F = \frac{SS_{between}/(c - 1)}{SS_{within}/(N - c)} = \frac{MS_{between}}{MS_{within}}$$

$MS_{within} = \frac{SS_{within}}{N - c}$ = Mean square variation within the samples, and
(N - c) represents the degrees of freedom within the samples.

$MS_{between} = \frac{SS_{between}}{c - 1}$ = Mean square variation between the samples, and
(c - 1) represents the degrees of freedom between the samples.

The F-test statistic follows the F-distribution with (c - 1) and (N - c) degrees of freedom.

Refer to the F-distribution table for the value of $F_{critical}$ for various levels of
significance (0.05, 0.01, etc.). Reject the null hypothesis if $F_{critical} < F_{calculated}$.

For the given example:

$F_{critical} \approx 3.74$ for the $\alpha = 0.05$ level of significance and degrees of
freedom (c - 1) = 2 and (N - c) = 14.

$$F_{calculated} = \frac{SS_{between}/(c - 1)}{SS_{within}/(N - c)} = \frac{10.2/2}{49.8/14} \approx 1.434$$

Since $F_{critical} > F_{calculated}$, the null hypothesis may be accepted.
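Steps 3 and 4 can be reproduced from the sums of squares obtained in Step 2. The sketch below is only an illustration: the critical value is typed in by hand as an assumption (about 3.74 in a standard F table for 2 and 14 degrees of freedom at the 0.05 level), not computed.

```python
ss_between, ss_within = 10.2, 49.8  # sums of squares from Step 2
n_total, n_groups = 17, 3           # N observations, c machines
f_critical = 3.74                   # assumed table value, F(2, 14) at alpha = 0.05

# Step 3: strength of the effect (eta squared).
ss_total = ss_between + ss_within
eta_squared = ss_between / ss_total

# Step 4: mean squares and the F ratio.
df_between = n_groups - 1
df_within = n_total - n_groups
ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_calculated = ms_between / ms_within

reject_null = f_calculated > f_critical
print(f"eta^2 = {eta_squared:.2f}, F = {f_calculated:.4f}, reject H0: {reject_null}")
```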

Step 5: Interpret the Results

The independent variable does not have a significant effect on the dependent
variable, if the null hypothesis of equal category means is not rejected.

On the other hand, the effect of the independent variable is significant, if the null
hypothesis is rejected.

A comparison of the category means will indicate the nature of the effect of the
independent variable.

For the given example, the null hypothesis may be accepted. There is no significant
difference in the means of the groups. The type of machine does not have a significant
effect on the output.


8.7 Two-way ANOVA

While doing research, the researcher is often concerned with the effect of more than one
factor simultaneously. In two-way ANOVA, the influences of two factors, with their
respective categories, on the dependent variable are considered simultaneously.

For example, the quality of fabric (high, medium, and low) interacts with price levels
(high, medium, and low) to influence a brand's sale. Here, the dependent variable is the
brand's sale and the independent variables are quality and pricing. Each of the two
factors has three levels (categories).

Depending upon the replication of data within the levels of the factor, two-way ANOVA is
classified into two types:

Two-factor ANOVA without replication, where each factor combination is
observed exactly once.

Two-factor ANOVA with replication, where each factor combination is observed
'm' number of times.

Two-way ANOVA Layout: Without Replication

The data format for two-way ANOVA without replication is shown in Table 8.7a.

Levels of Factor A (A_i)   Levels of Factor B (B_j)          Row Mean
                           B_1     B_2     ...     B_c
A_1                        Y_11    Y_12    ...     Y_1c      Ȳ_1.
A_2                        Y_21    Y_22    ...     Y_2c      Ȳ_2.
...                        ...     ...     ...     ...       ...
A_r                        Y_r1    Y_r2    ...     Y_rc      Ȳ_r.
Column Mean                Ȳ_.1    Ȳ_.2    ...     Ȳ_.c

Table 8.7a: Two-way ANOVA without Replication

i = 1, 2, 3, ..., r represents the different categories of factor A

j = 1, 2, 3, ..., c represents the different categories of factor B


Analysis of Variance Table for Two-way ANOVA

The ANOVA table can be set up in the usual fashion, as shown in Table 8.7b.

Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square | F-test Ratio
Factor A | $SS_A = c \sum_{i=1}^{r} (\bar{Y}_{i.} - \bar{Y})^2$ | $(r - 1)$ | $MS_A = \frac{SS_A}{r - 1}$ | $F_A = \frac{MS_A}{MS_E}$
Factor B | $SS_B = r \sum_{j=1}^{c} (\bar{Y}_{.j} - \bar{Y})^2$ | $(c - 1)$ | $MS_B = \frac{SS_B}{c - 1}$ | $F_B = \frac{MS_B}{MS_E}$
Error | $SS_E = \sum_{i=1}^{r} \sum_{j=1}^{c} (Y_{ij} - \bar{Y}_{i.} - \bar{Y}_{.j} + \bar{Y})^2$ | $(r - 1)(c - 1)$ | $MS_E = \frac{SS_E}{(r - 1)(c - 1)}$ |
Total | $SS_y = \sum_{i=1}^{r} \sum_{j=1}^{c} (Y_{ij} - \bar{Y})^2$ | $(rc - 1)$ | |

Table 8.7b: Analysis of Variance Table for Two-way ANOVA
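The entries of Table 8.7b can be computed directly from an r x c data table. The following Python sketch is one way to do it; the function name `two_way_anova_no_replication` and the sales figures are hypothetical, chosen only to illustrate the formulas.

```python
def two_way_anova_no_replication(table):
    """table[i][j] is the single observation for level i of A and level j of B."""
    r, c = len(table), len(table[0])
    grand_mean = sum(sum(row) for row in table) / (r * c)
    row_means = [sum(row) / c for row in table]
    col_means = [sum(table[i][j] for i in range(r)) / r for j in range(c)]

    ss_a = c * sum((m - grand_mean) ** 2 for m in row_means)
    ss_b = r * sum((m - grand_mean) ** 2 for m in col_means)
    ss_error = sum(
        (table[i][j] - row_means[i] - col_means[j] + grand_mean) ** 2
        for i in range(r)
        for j in range(c)
    )
    ss_total = sum((y - grand_mean) ** 2 for row in table for y in row)

    ms_a = ss_a / (r - 1)
    ms_b = ss_b / (c - 1)
    ms_error = ss_error / ((r - 1) * (c - 1))
    return {
        "SSA": ss_a, "SSB": ss_b, "SSE": ss_error, "SST": ss_total,
        "F_A": ms_a / ms_error, "F_B": ms_b / ms_error,
    }


# Hypothetical sales: rows = quality (high, medium, low),
# columns = price level (high, medium, low).
sales = [
    [20, 24, 30],
    [18, 21, 25],
    [15, 16, 20],
]
result = two_way_anova_no_replication(sales)
for key, value in result.items():
    print(key, round(value, 3))
```

Each F ratio is then compared with the tabulated F value for its own degrees of freedom, (r - 1) or (c - 1) in the numerator and (r - 1)(c - 1) in the denominator.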


8.8 Analysis of Covariance (ANOCOVA)

While studying differences in the mean values of the dependent variables related to the
effect of the controlled independent variables, it is often necessary to take into
account the influence of uncontrolled independent variables.

Using Analysis of Covariance Technique (ANOCOVA), the influence of uncontrolled

variables is usually removed by simple linear regression method and the residual sums

of squares are used to provide variance estimates, which, in turn, are used to make

tests of significance.

[Figure: X (IV) influences Y (DV); the uncontrolled variable Z also influences Y]

Consider the influence of variable X (IV) on variable Y (DV) and also, the influence of
the uncontrolled variable, Z, which is correlated to the variable Y.

Covariance analysis consists of:

Subtracting from each individual score ($Y_i$) that portion of it ($Y_i'$) that is
predictable from the uncontrolled variable ($Z_i$).

Computing the usual analysis of variance on the resulting (Y - Y').
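The two steps above can be sketched in Python as follows. This is a simplified illustration under stated assumptions: Y' is predicted from Z with a single least-squares line fitted to all observations, and the records are hypothetical. A full analysis of covariance would use the pooled within-group regression slope and give up one error degree of freedom for it, so the F value printed here is indicative only.

```python
def mean(values):
    return sum(values) / len(values)


def remove_covariate(y, z):
    """Return Y - Y', where Y' is the least-squares prediction of Y from Z."""
    z_bar, y_bar = mean(z), mean(y)
    slope = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y)) / sum(
        (zi - z_bar) ** 2 for zi in z
    )
    intercept = y_bar - slope * z_bar
    return [yi - (intercept + slope * zi) for yi, zi in zip(y, z)]


def one_way_f(groups):
    """F ratio of a one-way analysis of variance."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (len(groups) - 1)) / (ss_within / (len(all_values) - len(groups)))


# Hypothetical records: (treatment group, Y score, uncontrolled variable Z).
records = [
    ("T1", 12, 3), ("T1", 15, 5), ("T1", 14, 4),
    ("T2", 18, 6), ("T2", 17, 4), ("T2", 21, 7),
    ("T3", 11, 2), ("T3", 16, 6), ("T3", 13, 3),
]
y = [score for _, score, _ in records]
z = [covariate for _, _, covariate in records]

# Step 1: subtract the part of each score that is predictable from Z.
adjusted = remove_covariate(y, z)

# Step 2: usual analysis of variance on the adjusted scores (Y - Y').
groups = {}
for (label, _, _), value in zip(records, adjusted):
    groups.setdefault(label, []).append(value)
f_adjusted = one_way_f(list(groups.values()))
print(round(f_adjusted, 4))
```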

ANOCOVA: Assumptions

Assume that there is some sort of relationship between the dependent variable

and the uncontrolled variable.

Various treatment groups are selected at random from the population.

The groups are homogeneous in variability.

The regression is linear and is the same from group to group.

Source: C. R. Kothari, Research Methodology-Methods and Techniques, Second Revised Edition, New Age

International Publishers, 2004.


8.9 Chapter Summary

ANOVA is a data analyzing technique to determine differences between the means
of more than two samples.

It considers that total variation is due to the variation in treatment and the
variation that is unexplained.

In one-way ANOVA, only one factor of influence is considered for study; in two-way
ANOVA, the influences of two factors are considered simultaneously.

Two estimates are compared with the F-test for the given degrees of freedom and
level of significance. The 'F' statistic in the ANOVA is as given below:

$$F = \frac{\text{Estimate of population variance based on between sample variance}}{\text{Estimate of population variance based on within sample variance}}$$

In analysis of covariance, the effect of uncontrolled variables associated with
dependent variables is analyzed by the regression method.



Non-parametric Testing and Chi-square Test

Research Methodology

09. Non-parametric Testing and Chi-square Test eBook

9.1 Introduction

There are various statistical techniques that are based on assumptions about the
population from which the samples are drawn, for example, assuming that the sample is
drawn from a normally-distributed population. All such techniques fall under parametric
tests. In situations where no rigid assumptions can be made about the population, a
parametric test is not applicable. For such data, non-parametric tests are the only
choice because they make no assumptions regarding the population and its parameters.

In this chapter, different non-parametric tests will be discussed. The chi-square test,
a non-parametric test, is also discussed. The chi-square test is a widely used test for
inference in almost all fields of research.

After reading this chapter, you will be able to:

Describe non-parametric tests

State the advantages and disadvantages of non-parametric tests over parametric
tests

Discuss Chi-square test

Explain sign test and run test

Explain Spearman's rank correlation coefficient and Kendall's coefficient

Discuss Wilcoxon matched-pairs test


9.2 Non-parametric Test

Definition

The non-parametric test is for inferences which do not need any assumptions about the
distribution of variables. Thus, a non-parametric test is also known as a
'distribution-free test'. For example, if the sales of two sports goods' brands are to
be compared and there is no assumption about the distribution of the two variables
(both brands), then you will be using a non-parametric test for the inferences.

According to Gibbons, "a statistical technique is said to be non-parametric if it

satisfies one of the following five criteria:

1. The data are count data of number of observations in each category

2. The data are nominal scale data

3. The data are ordinal scale data

4. The inference does not concern a parameter

5. The assumptions are general rather than specified."

In a parametric test, assumptions about the distribution of variables are necessary, so
when those assumptions hold, its result is more reliable than that of a non-parametric
test. Thus, parametric and non-parametric tests differ from each other. Table 9.2a shows
the advantages and disadvantages of non-parametric test over parametric test.

Advantages

1. Non-parametric methods are readily comprehensible, very simple, easy to apply, and
do not require a complicated sample theory.

2. No assumption is made about the form or the frequency function of the parent
population from which the sampling is done.

3. No parametric technique will apply to the data which are mere classification (that
is, which are measured in nominal scale), while non-parametric methods exist to deal
with such data.

4. Since the socio-economic data are not, in general, normally-distributed,
non-parametric tests have found applications in psychometrical, sociological, and
educational statistics.

5. Non-parametric tests are available to deal with data that are given in ranks or
whose seemingly numerical scores have the strength of ranks. For instance,
non-parametric tests can be applied if the scores are given in grades, such as: A+, A-,
B, A, B+, etc.

Disadvantages

1. Non-parametric tests can be used only if the measurements are nominal or ordinal.
Even in that case, if a parametric test exists, it is more powerful than the
non-parametric test. In other words, if all the assumptions of a statistical model are
satisfied by the data and if the measurements are of required strength, then the
non-parametric tests are a waste of time and data.

2. So far, no non-parametric methods exist for testing interactions in an 'analysis of
variance' model, unless special assumptions about the additivity of the model are made.

3. Non-parametric tests are designed to test statistical hypotheses only and not for
estimating the parameters.

Table 9.2a: Advantages and Disadvantages of Non-parametric Test over Parametric Test

Source: Gupta, S. C., Kapoor, V. K., Fundamentals of Mathematical Statistics, Eleventh Edition, Sultan
Chand & Sons, New Delhi, 2002

In this chapter, some of the non-parametric tests, like sign test, rank sum test, run
test, Kendall's test, and chi-square test are discussed.

Non-parametric tests are very easy to calculate and also provide quick results. In a
situation where you have data that is not exact, or have no information about the
population distribution from which it is taken, the parametric test fails and the
non-parametric test is the only savior.


9.3 Chi-square Test

The Chi-square test is symbolically represented as $\chi^2$. The chi-square test is used
for the following cases:

To test the significance of population variance or homogeneity

To test the goodness of fit

To test the significance of association between two attributes

Chi-square to Test the Significance of Population Variance

The chi-square test can be used to determine whether the population variance is
significant, that is, a sample is drawn from a population, which is normally distributed
with mean $\mu$ and variance $\sigma^2$.

The null hypothesis is:

$H_0: \sigma_s^2 = \sigma_p^2$

Where,

$\sigma_s^2$ = Variance of the sample

$\sigma_p^2$ = Variance of the population

n = Sample size

Then the test statistic is given as:

$$\chi^2 = \frac{\sigma_s^2}{\sigma_p^2}(n - 1)$$

The value obtained from the above formula is compared with the critical value of
chi-square at $\alpha$ level of significance, $\chi^2_{\alpha, n-1}$. If the calculated
value is greater than the tabulated value of chi-square, then the null hypothesis is
rejected at level of significance $\alpha$.
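As a rough illustration of this decision rule, the Python sketch below evaluates the statistic for made-up numbers. The sample size, the two variances, and the function name are all hypothetical; the critical value is taken by hand from a standard chi-square table (16.919 for 9 degrees of freedom at the 0.05 level) rather than computed.

```python
def chi_square_variance_statistic(sample_variance, population_variance, n):
    """chi^2 = (sample variance / hypothesised population variance) * (n - 1)."""
    return sample_variance / population_variance * (n - 1)


# Hypothetical numbers, for illustration only.
n = 10
sample_variance = 20.0
population_variance = 16.0
chi_square_critical = 16.919  # assumed table value, 9 degrees of freedom, alpha = 0.05

chi_square = chi_square_variance_statistic(sample_variance, population_variance, n)
reject_null = chi_square > chi_square_critical
print(chi_square, reject_null)
```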

Chi-square is also used as a non-parametric test when the assumptions about the
population from which the samples are drawn are not known. The chi-square for
non-parametric testing involves the chi-square goodness of fit test and the chi-square
test of the significance of association between two attributes.


Chi-square Goodness of Fit

This test measures the difference between the observed and the theoretical (expected)

frequencies. This mechanism was developed by Karl Pearson in 1900, who named it

Chi-square goodness of fit.

To test the null hypothesis,

$H_0$: There is no difference between the expected and observed frequencies

the test statistic is given by:

$$\chi^2 = \sum_{i=1}^{k} \frac{(o_i - e_i)^2}{e_i} \sim \chi^2_{k-1}$$

Where,

$e_i$ = Expected frequency

$o_i$ = Observed frequency

k = Number of categories

If a sample is arranged in k categories and the observed frequency is given, then the
expected frequency is calculated by the formula:

$$e_i = n p_i, \quad i = 1, 2, 3, \dots, k$$

Where,

$p_i$ = The probability that the random variable X falls in the i-th category

n = Sample size

Thus, by comparing the calculated and the critical value at a specified level of
significance, $H_0$ is accepted or rejected accordingly.

Example 01:

If two coins are tossed 60 times, then the number of heads is given as shown in Table
9.3a. Test, at a 5% level of significance, whether the observed frequencies agree with
the expected frequencies.

Number of Heads    0      1      2

Frequency          15     25     20

Probability        0.25   0.5    0.25

Table 9.3a: The Frequency Distribution of Number of Heads


Solution 01:

x       p_i     o_i    e_i    o_i - e_i    (o_i - e_i)²    (o_i - e_i)²/e_i
0       0.25    15     15     0            0               0
1       0.5     25     30     -5           25              25/30 = 5/6
2       0.25    20     15     5            25              25/15 = 5/3
Total                                                      15/6

Table 9.3b: The Frequency Distribution of Number of Heads

Therefore, $\chi^2 = \sum \frac{(o_i - e_i)^2}{e_i} = \frac{15}{6} = 2.5$

The critical value of $\chi^2_{k-1}$ (here k - 1 = 2 degrees of freedom) at 5% level of
significance is 5.991. Since the calculated $\chi^2 < \chi^2_{k-1}$, we accept the null
hypothesis at 5% level of significance and conclude that there is no difference in the
observed and expected frequencies.
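The same goodness-of-fit calculation can be written as a short Python sketch. The observed frequencies and probabilities are those of Table 9.3a; the critical value 5.991 is the table value quoted in the solution, entered by hand.

```python
observed = [15, 25, 20]            # frequencies of 0, 1, and 2 heads
probabilities = [0.25, 0.5, 0.25]  # probabilities of 0, 1, and 2 heads
n = sum(observed)                  # 60 tosses

# Expected frequency e_i = n * p_i for each category.
expected = [n * p for p in probabilities]

# Chi-square statistic: sum of (o_i - e_i)^2 / e_i.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

chi_square_critical = 5.991        # table value, 2 degrees of freedom, 5% level
accept_null = chi_square < chi_square_critical
print(expected, round(chi_square, 4), accept_null)
```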

Chi-square to Test the Significance of Association Between Two Attributes

Suppose two attributes, A and B, are divided into 'r' and 's' sub-categories and
arranged in an r x s contingency table, as shown in Table 9.3c.

         A_1         A_2         ...    A_r         Total
B_1      (A_1B_1)    (A_2B_1)    ...    (A_rB_1)    (B_1)
B_2      (A_1B_2)    (A_2B_2)    ...    (A_rB_2)    (B_2)
...      ...         ...         ...    ...         ...
B_s      (A_1B_s)    (A_2B_s)    ...    (A_rB_s)    (B_s)
Total    (A_1)       (A_2)       ...    (A_r)       N

Table 9.3c: r x s Contingency Table

From Table 9.3c, $\sum A_i = \sum B_j = N$


Here, the null hypothesis is $H_0$: There is no association between the two attributes

The expected frequencies are obtained in a similar way to the chi-square goodness of
fit test.

$$\text{Expected frequency for } (A_iB_j) = \frac{\text{Total of } i\text{-th column} \times \text{Total of } j\text{-th row}}{\text{Grand total } (N)}$$

And the test statistic is given as:

$$\chi^2 = \sum \frac{(o_i - e_i)^2}{e_i}$$

Where,

$o_i$ = Observed frequency

$e_i$ = Expected frequency

Finally, the critical value is obtained in order to test the significance of the null
hypothesis at $\alpha$ level of significance for (r - 1)(s - 1) degrees of freedom.

Example 02:

Table 9.3d shows the distribution of sales of two apparel brands in showrooms,
shopping centers, and online shopping.

Brands   Sales from                                          Total
         Showrooms   Shopping Centers   Online Shopping
A        120         100                60                   280
B        180         60                 80                   320
Total    300         160                140                  600

Table 9.3d: 2 x 3 Contingency Table

Test, at a 5% level of significance, whether both the brands are equally preferred
across the outlets.


Solution 02:

The null hypothesis is $H_0$: There is no difference between sales of the two brands in
the given outlets.

The expected frequencies of brand A are:

Sales in showroom = $\frac{280 \times 300}{600} = 140$

Sales in shopping centre = $\frac{280 \times 160}{600} = 74.667$

Sales in online shopping = $\frac{280 \times 140}{600} = 65.333$

Similarly, for brand B:

Sales in showroom = $\frac{320 \times 300}{600} = 160$

Sales in shopping centre = $\frac{320 \times 160}{600} = 85.333$

Sales in online shopping = $\frac{320 \times 140}{600} = 74.667$

Therefore, the value of the $\chi^2$ statistic is calculated, as shown in Table 9.3e.

Brands                 Observed        Expected        (o - e)    (o - e)²    (o - e)²/e
                       Frequency (o)   Frequency (e)
A   Showrooms          120             140             -20        400         2.857
A   Shopping Centers   100             74.667          25.333     641.761     8.595
A   Online Shopping    60              65.333          -5.333     28.441      0.435
B   Showrooms          180             160             20         400         2.5
B   Shopping Centers   60              85.333          -25.333    641.761     7.521
B   Online Shopping    80              74.667          5.333      28.441      0.381
Total                                                                         22.289

Table 9.3e: 2 x 3 Contingency Table

Degree of freedom is (r - 1)(s - 1) = (2 - 1)(3 - 1) = 2


Therefore, the critical value of chi-square for 2 degrees of freedom at 5% level of
significance is 5.991. Since the critical value is less than the calculated value of
chi-square (22.289), we reject the null hypothesis at 5% level of significance.
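The expected frequencies and the chi-square value of this example can be recomputed with a few lines of Python. The observed counts come from Table 9.3d; the critical value is the table value used in the solution, entered by hand.

```python
observed = [
    [120, 100, 60],  # brand A: showrooms, shopping centers, online shopping
    [180, 60, 80],   # brand B
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected frequency = (row total x column total) / grand total.
expected = [[rt * ct / grand_total for ct in col_totals] for rt in row_totals]

chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)

chi_square_critical = 5.991  # table value, 2 degrees of freedom, 5% level
reject_null = chi_square > chi_square_critical
print(round(chi_square, 3), degrees_of_freedom, reject_null)
```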

Characteristics of Chi-square Test

The main characteristics of chi-square test are:

Since no assumptions are available about the population, the test is not based on
parameters, such as mean, standard deviation, etc., but it is based on the
frequencies.

It is not applicable for estimation. It is only used for testing hypotheses.

It follows the additive property.

It is useful for complex contingency tables.

Conditions for the Application of Chi-square Test

Some conditions where the chi-square test can be applied are:

Observations recorded and used are collected on a random basis.

All the items in the sample must be independent.

No group should contain less than 10 items. In cases where the frequencies are
less than 10, regrouping is done by combining the frequencies of adjoining groups,
so that the new frequencies become greater than 10.

Some statisticians think this number is 5 but most believe 10 is better.

The overall number of items must also be reasonably large. It should normally be
at least 50, howsoever small the number of groups may be.

The constraints must be linear. Constraints which involve linear equations in the

cell frequencies of a contingency table (equations containing no squares or higher

powers of frequency) are known as linear constraints.

Source: Kothari, Research Methodology, 2002, New Age International, New Delhi.

9.4 Sign Test

The easiest and simplest of all non-parametric tests is the sign test. In this test,
the direction of each observation (positive or negative, denoted by the '+' and '-'
signs) is considered instead of its magnitude. There are two types of sign test:

One sample sign test

Two sample sign test


One Sample Sign Test

Suppose data $X_1, X_2, X_3, \dots, X_n$ are given with median $\theta$. This test is used
when you want to test the median against a specific hypothesized value ($\theta_0$).

It is assumed that $P(X < \theta) = P(X > \theta) = \frac{1}{2}$.

Then, the null hypothesis is:

$H_0: \theta = \theta_0$

If the value of a sample observation is greater than $\theta_0$, it is replaced by a
positive sign '+'; otherwise, it is replaced by a negative sign '-'. However, if the
value of a sample observation is equal to $\theta_0$, it is ignored.

The total number of '+' signs (r) and the total number of '-' signs (s) are such that
$r + s \le n$ (with equality when no observation equals $\theta_0$).

Thus, in order to test the hypothesis here, r is considered to follow the binomial
distribution with $p = \frac{1}{2}$. Then, the hypotheses are stated as:

$H_0: p = \frac{1}{2}$ and $H_1: p \neq \frac{1}{2}$ or $p < \frac{1}{2}$

However, for a large sample size, the normal approximation to the binomial distribution
is used.

Example 03:

The numbers of pages printed by 12 printing machines in a printing press are as given: 320, 370, 430, 320, 350, 310, 390, 380, 360, 320, 400, and 320.

Using the sign test at 5% level of significance, test that the average pages printed are 385.

Solution 03:

The null hypothesis is $H_0 : \theta = 385$.

Each value greater than 385 is replaced by a positive sign '+' and each value less than 385 by a negative sign '-', as below:

-, -, +, -, -, -, +, -, -, -, + and -.

Thus, the modified data follow a binomial distribution and the null hypothesis is $H_0 : p = \frac{1}{2}$.

Here n = 12, r = 3, s = 9, and $p = \frac{1}{2}$, so the probability of obtaining 3 or fewer '+' signs, from the binomial table, is 0.073, which is greater than $\alpha = 0.05$.

Therefore, the null hypothesis is accepted at 5% level of significance and we can conclude that the average pages printed are 385.
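The counting in Example 03 can be checked with a short sketch. The function name `one_sample_sign_test` is an assumption for illustration, and the lower-tail binomial probability is computed exactly with the standard library rather than read from a table.

```python
from math import comb


def one_sample_sign_test(data, theta0):
    """Return (r, s, lower-tail probability) for the one-sample sign test."""
    r = sum(1 for x in data if x > theta0)   # number of '+' signs
    s = sum(1 for x in data if x < theta0)   # number of '-' signs
    n = r + s                                # observations equal to theta0 are ignored
    # P(R <= r) when R follows a binomial distribution with p = 1/2
    p_lower = sum(comb(n, k) for k in range(r + 1)) / 2 ** n
    return r, s, p_lower


pages = [320, 370, 430, 320, 350, 310, 390, 380, 360, 320, 400, 320]
r, s, p_lower = one_sample_sign_test(pages, 385)
print(r, s, round(p_lower, 4))  # 3 9 0.073
```

Because the tail probability exceeds 0.05, the sketch reaches the same decision as the worked solution: the null hypothesis is not rejected.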

Two Sample Sign Test

This test is used to determine whether two samples are drawn from an identical population. It applies when the observations of the two samples can be taken in pairs.

In this method, each pair of values is replaced with a positive sign '+' if the value of the first sample is greater than the value of the second sample, and with a negative sign '-' if it is smaller. If the two values are equal, the pair is ignored.
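A minimal sketch of the sign assignment for two samples follows. The function name `paired_signs` and the two short samples are hypothetical, chosen only to show one '+', one '-', and one ignored pair.

```python
def paired_signs(first, second):
    """Replace each pair by '+' or '-'; pairs with equal values are ignored."""
    signs = []
    for a, b in zip(first, second):
        if a > b:
            signs.append('+')
        elif a < b:
            signs.append('-')
        # equal values: the pair contributes no sign
    return signs


# Hypothetical paired observations (not taken from the text).
first_sample = [12, 15, 9, 14, 11, 10]
second_sample = [10, 15, 11, 12, 9, 8]
signs = paired_signs(first_sample, second_sample)
print(signs, signs.count('+'), signs.count('-'))
```

The resulting counts of '+' and '-' signs are then tested against a binomial distribution with p = 1/2, exactly as in the one sample case.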

9.5 Run Test

The word 'run' here denotes a sequence of identical symbols that is followed or preceded by a different symbol or no symbol. A run test is used to verify whether there is any randomness among the observations in a given data set. Thus, a run test helps you to find out if the sample is randomly selected from the population.

For example, before launching a new health drink in the market, the manager of the company wants to conduct a survey to determine which age group would be the preferable target group for the product. Customers in the age group < 25 years are denoted by T and those > 25 years are denoted by O. The manager puts up a counter for the customers to taste the health drink and feedback is taken.

Here, the null hypothesis is:

$H_0$ : The customers in age group < 25 and > 25 visiting the counter are random.

Suppose the sequence representing the type of customers coming to the counter is given as (runs 1 to 5 are separated by bars):

T T | O O | T T T T T T T T | O O O O O O | T T T T

Thus, in the above representation, the total number of runs (r) is 5; 14 customers are of ages < 25 and 8 are of ages > 25, that is, $n_1 = 14$ and $n_2 = 8$.


For small samples, when the sample size is less than 20, the lower ($r_l$) and upper ($r_u$) critical values at a specific significance level can be obtained from the table for run tests. If $r_l \le r \le r_u$, then the null hypothesis is accepted.

If the sample size is greater than 20, then the sampling distribution of r tends to a normal distribution with mean ($\mu$) and variance ($\sigma^2$), where

$$\mu = \frac{2 n_1 n_2}{n_1 + n_2} + 1$$

$$\sigma^2 = \frac{2 n_1 n_2 (2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)}$$

To test r, the following standard normal statistic is obtained:

$$Z = \frac{r - \mu}{\sigma}$$

If the calculated value of Z lies between the tabulated values $-Z_{\alpha/2}$ and $Z_{\alpha/2}$, then the null hypothesis is accepted; otherwise, it is rejected.
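As an illustration of the large-sample formulas, consider the health-drink sequence given earlier, for which $n_1 = 14$, $n_2 = 8$, and r = 5, so that the total sample size is 22. The figures below are a worked sketch added for clarity; they assume a two-sided test at 5% level of significance, for which the tabulated value is $Z_{\alpha/2} = 1.96$.

```latex
\mu = \frac{2(14)(8)}{14 + 8} + 1 = \frac{224}{22} + 1 \approx 11.18

\sigma^2 = \frac{224\,(224 - 14 - 8)}{(22)^2\,(22 - 1)} = \frac{45248}{10164} \approx 4.45,
\qquad \sigma \approx 2.11

Z = \frac{5 - 11.18}{2.11} \approx -2.93
```

Since -2.93 lies outside the interval from -1.96 to 1.96, the null hypothesis of randomness would be rejected for this sequence under these assumptions.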

Example 04:

A die is thrown 20 times and you need to test whether the occurrence of an even number (E) and an odd number (O) is random or not. The observed sequence is:

E E E E E E E O O O E E O O O O E E E E

Use $\alpha = 0.05$.

Solution 04:

In the given sequence,

$n_1$ = Number of occurrences of even numbers = 13

$n_2$ = Number of occurrences of odd numbers = 7

r = Number of runs = 5

Here, $H_0$ : The events are random.

The lower ($r_l$) and upper ($r_u$) critical values of r at 5% level of significance for $n_1 = 13$, $n_2 = 7$ are 5 and 15, respectively. As a result, $r_l \le r \le r_u$, so we accept the null hypothesis and state that the occurrence of even and odd numbers in an experiment where the die is thrown 20 times is random at 5% level of significance.
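The counts used in Solution 04 can be reproduced with a few lines of code. The helper name `count_runs` is an assumption for illustration; the critical values 5 and 15 are taken from the solution above, not computed.

```python
from itertools import groupby


def count_runs(sequence):
    """Return (number of runs, dictionary of counts per symbol)."""
    # groupby yields one group per maximal block of identical symbols, i.e. per run
    runs = [symbol for symbol, _ in groupby(sequence)]
    counts = {symbol: sequence.count(symbol) for symbol in set(sequence)}
    return len(runs), counts


throws = "EEEEEEEOOOEEOOOOEEEE"      # sequence from Example 04
r, counts = count_runs(throws)
r_lower, r_upper = 5, 15             # tabulated critical values quoted in the solution
accept_h0 = r_lower <= r <= r_upper
print(r, counts, accept_h0)
```

The same helper applied to the health-drink sequence returns 5 runs with 14 T's and 8 O's.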

9.6 Spearman's Rank Correlation

When the data values are not numerically measurable but can be ranked, a rank correlation coefficient is used to measure the association between the variables. It was formulated by Charles Edward Spearman in 1904. Thus, it is known as Spearman's rank correlation coefficient and is denoted by $\rho$.

The formula for obtaining the rank correlation is:

$$\rho = 1 - \left\{ \frac{6 \sum d_i^2}{n(n^2 - 1)} \right\}$$

Where,

$d_i$ = Difference between ranks = $(R_1 - R_2)$

$R_1$ = Ranks assigned to values of the first variable

$R_2$ = Ranks assigned to values of the second variable

$\sum d_i^2$ = Sum of the squares of difference between ranks

n = Number of paired observations

Here, the null hypothesis is $H_0$ : The variables are independent, or there is no correlation between the variables.

Against the alternative hypothesis,

$H_1$ : The variables are dependent, or there is a correlation between the variables.

For sample sizes less than 30, the calculated value of $\rho$ is compared with the tabulated (critical) value. If the calculated value is less than the tabulated value, the null hypothesis is accepted; otherwise, it is rejected.

For sample sizes greater than 30, the sampling distribution of $\rho$ is assumed to follow a normal distribution with mean zero and standard deviation $1/\sqrt{n - 1}$.

That is, standard error, $\sigma_\rho = \frac{1}{\sqrt{n - 1}}$

The table for the normal curve is used for the critical value.


Note:

If the ranks of two or more values are equal, then each of those values is assigned the average of the ranks that would have been assigned to them had they been different. The formula for the statistic is then adjusted by the term $\frac{1}{12}(m^3 - m)$, where m denotes the number of observations involved in a tie in any of the variables under study.

$$\rho = 1 - \frac{6\left[\sum d_i^2 + \sum \frac{1}{12}(m^3 - m)\right]}{n(n^2 - 1)}$$

Where the summation $\sum \frac{1}{12}(m^3 - m)$ runs over all the groups of tied ranks.

Example 05:

Table 9.6a displays the values of two variables X and Y. Test whether the variables are independent or not at 5% level of significance.

X      Y
101    120
111    125
102    123
105    121
112    122
109    126

Table 9.6a: Distribution of X and Y

Solution 05:

For the above problem, the hypotheses are:

$H_0$ : The variables X and Y are independent.

$H_1$ : The variables X and Y are dependent.

Here, Spearman's rank correlation coefficient is used, which is given as:

$$\rho = 1 - \left\{ \frac{6 \sum d_i^2}{n(n^2 - 1)} \right\}$$


Table 9.6b is constructed to obtain the calculated value of $\rho$.

X      Y      R_1   R_2   d_i   d_i^2
101    120    1     1      0    0
111    125    5     5      0    0
102    123    2     4     -2    4
105    121    3     2      1    1
112    122    6     3      3    9
109    126    4     6     -2    4
Total                           18

Table 9.6b: Distribution of X and Y

Therefore,

$$\rho = 1 - \left\{ \frac{6 \sum d_i^2}{n(n^2 - 1)} \right\} = 1 - \left\{ \frac{6 \times 18}{6(6^2 - 1)} \right\} = 1 - \frac{108}{210} = \frac{102}{210} = 0.486$$

Here, for n = 6, at 5% level of significance, the tabulated value of $\rho$ is 0.886. Since it is greater than the calculated value, we accept the null hypothesis. Thus, the variables X and Y are independent.
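The arithmetic of Solution 05 can be verified with a small sketch. The helper names `ranks` and `spearman_rho` are assumptions for illustration, and the ranking helper assumes there are no tied values, which holds for this data.

```python
def ranks(values):
    """Rank values from smallest (1) to largest; assumes no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman_rho(x, y):
    """Spearman's rank correlation without the correction for ties."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))   # sum of squared rank differences
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


x = [101, 111, 102, 105, 112, 109]
y = [120, 125, 123, 121, 122, 126]
rho = spearman_rho(x, y)
print(ranks(x), ranks(y), round(rho, 3))
```

The printed ranks match the R_1 and R_2 columns of Table 9.6b, and the coefficient rounds to 0.486.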

9.7 Kendall's Test

This test is an important non-parametric test for measuring the significance of the relationship between variables. When two variables are tested for the association between them, either Spearman's rank correlation coefficient or Kendall's coefficient is applied. Kendall's test is also termed Kendall's Coefficient of Concordance (W).

Here, the hypotheses are stated as:

$H_0$ : The variables X and Y are independent.

$H_1$ : The variables X and Y are dependent.


The assumptions for applying Kendall's test are:

The ranking is given independently

The data is ordinal in nature

The procedure for computing and interpreting Kendall's coefficient of concordance (W) is:

1. All the objects, N, should be ranked by all k judges in the usual fashion and this information may be put in the form of a k by N matrix.

2. For each object, determine the sum of ranks ($R_j$) assigned by all the k judges (j = 1, 2, 3, ..., N).

3. Determine $\bar{R}_j$ (the mean of the $R_j$), and obtain the value of s as given:

$$s = \sum (R_j - \bar{R}_j)^2$$

4. Work out the value of W, using the following formula:

$$W = \frac{s}{\frac{1}{12} k^2 (N^3 - N)}$$

Where,

$s = \sum (R_j - \bar{R}_j)^2$

N = Number of objects ranked

Source: Kothari, Research Methodology-Methods and Techniques, New Age International Publishers, New Delhi, 2002

Note:

If there are tied ranks in the data, then the above formula is modified as:

$$W = \frac{s}{\frac{1}{12} k^2 (N^3 - N) - k \sum T}$$

A correction factor T is calculated for each of the k sets of ranks and these are added together over the k sets to obtain $\sum T$.

$$T = \frac{\sum (t^3 - t)}{12}$$

Here t is the number of observations in a group tied for a given rank, and the summation runs over all the groups of tied ranks within a set.


Example 06:

The ranks obtained by 5 candidates from 4 interviews that were conducted for the post of a CA are as given in Table 9.7a.

Candidate   A   B   C   D
1           1   1   5   1
2           3   5   3   4
3           4   4   2   2
4           2   2   4   5
5           5   3   1   3

Table 9.7a: Distribution of Ranks Given by the 4 Judges

Test at 5% level of significance that the ranks assigned by judges are different.

Solution 06:

For this, we construct Table 9.7b.

Candidate   A   B   C   D   Sum of ranks (R_j)   (R_j - mean)^2
1           1   1   5   1    8                   16
2           3   5   3   4   15                    9
3           4   4   2   2   12                    0
4           2   2   4   5   13                    1
5           5   3   1   3   12                    0
Total                       60                   s = 26

Table 9.7b: Distribution of Ranks Given by the 4 Judges

The mean of the rank sums is $\bar{R}_j = 60/5 = 12$.

Here, the hypotheses are stated as:

$H_0$ : The rankings given by the judges are independent.

$H_1$ : The rankings given by the judges are dependent.


Therefore,

$$W = \frac{s}{\frac{1}{12} k^2 (N^3 - N)} = \frac{26}{\frac{1}{12} \times 4^2 \times (5^3 - 5)} = \frac{26}{160} = 0.1625$$

The calculated value of s is 26 and the tabulated value for k = 4 and N = 5 (using Kendall's table) is 88.4, which is greater than the calculated value.

So, we accept the null hypothesis and conclude that the agreement among the judges' rankings is not significant at 5% level of significance.
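The four steps of the procedure can be followed in code for the data of Example 06. The function name `kendalls_w` and the row-per-object layout of the rank matrix are assumptions for illustration; the sketch uses the formula without the correction for ties.

```python
def kendalls_w(rank_matrix):
    """rank_matrix[i][j] is the rank given to object i by judge j (no tied ranks)."""
    n_objects = len(rank_matrix)
    k_judges = len(rank_matrix[0])
    rank_sums = [sum(row) for row in rank_matrix]            # R_j for each object
    mean_sum = sum(rank_sums) / n_objects                    # mean of the R_j
    s = sum((r_j - mean_sum) ** 2 for r_j in rank_sums)      # sum of squared deviations
    w = s / (k_judges ** 2 * (n_objects ** 3 - n_objects) / 12)
    return rank_sums, s, w


# Ranks of the 5 candidates (rows) by judges A, B, C, D (columns), from Table 9.7a.
table_9_7a = [
    [1, 1, 5, 1],
    [3, 5, 3, 4],
    [4, 4, 2, 2],
    [2, 2, 4, 5],
    [5, 3, 1, 3],
]
rank_sums, s, w = kendalls_w(table_9_7a)
print(rank_sums, s, w)
```

The rank sums and s agree with Table 9.7b, and W agrees with the value computed in the solution.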

Relationship between Spearman's Correlation Coefficient and Kendall's Coefficient

W is an appropriate measure for studying the degree of association among three or more sets of ranks. You can also determine the degree of association among k sets of rankings by averaging the Spearman's correlation coefficients ($\rho$) between all $\frac{k(k - 1)}{2}$ possible pairs of rankings, because W bears a linear relation to the average $\rho$ taken over all possible pairs. The relationship between the average of $\rho$ and Kendall's W can be put in the following form:

$$\text{Average of } \rho = \frac{kW - 1}{k - 1}$$

However, the method of finding W using the average $\rho$ between all possible pairs is quite tedious, particularly when k happens to be a big figure and, as such, this method is rarely used in practice for finding W (Kothari, Research Methodology, 2002).
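As a check on this relationship, the values from Example 06 (k = 4 judges, W = 0.1625) can be substituted. The pairwise coefficients quoted below were computed for this illustration from the ranks in Table 9.7a using Spearman's formula with n = 5; they are not given in the source.

```latex
\text{Average of } \rho = \frac{kW - 1}{k - 1}
                        = \frac{4(0.1625) - 1}{4 - 1}
                        = \frac{-0.35}{3} \approx -0.117

% Direct average over the k(k-1)/2 = 6 pairs AB, AC, AD, BC, BD, CD:
\frac{0.6 + (-1.0) + 0.1 + (-0.6) + 0.3 + (-0.1)}{6} = \frac{-0.7}{6} \approx -0.117
```

Both routes give the same average, which illustrates the linear relation between W and the average $\rho$.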


9.8 Wilcoxon Matched-pairs Test

The Wilcoxon matched-pairs test is used in the case of paired data. If the data for paired samples is given, such as the values before and after a medical treatment or the supply and demand of a commodity in the market, then the Wilcoxon test is the most suitable of all non-parametric tests.

If X and Y are two paired data of small sample sizes, then the difference between the values of each pair is obtained and is denoted by $d_i$, that is, $d_i = X_i - Y_i$.

Then, ranks are assigned to the absolute differences, ignoring the + and - signs and also ignoring the differences equal to zero. The next step is to calculate the sum of all ranks with a positive sign ($T^+$) and with a negative sign ($T^-$), and then obtain $T = \min(T^+, T^-)$.

The hypotheses are stated as:

$H_0$ : There is no difference between the two samples.

$H_1$ : There is a difference between the two samples.

Finally, the critical value is obtained at a specified level of significance. If the calculated value of T is less than or equal to the critical value, the null hypothesis is rejected; otherwise, it is accepted.

Example 07:

Using the Wilcoxon matched-pairs test, test whether the two samples are significantly different at 5% level of significance.

First Sample     10   21   15   14   11   16   13   16
Second Sample     9   25   13   15   17   16   10   11

Table 9.8a: Distribution of Values of Two Samples X and Y

Solution 07:

$H_0$ : There is no difference between the two samples.

To calculate the Wilcoxon matched-pairs test, we prepare Table 9.8b.

X_i   Y_i   d_i   |d_i|   Rank
10     9     1     1      1.5
21    25    -4     4      5
15    13     2     2      3
14    15    -1     1      1.5
11    17    -6     6      7
16    16     0     0      -
13    10     3     3      4
16    11     5     5      6

Table 9.8b: Distribution of Values of Two Samples X and Y

Here, $T^- = 13.5$ and $T^+ = 14.5$.

Therefore, $\min(T^+, T^-)$ is 13.5.

The tabulated value of T at 5% level of significance for a two-tailed test with 7 non-zero differences is 2, which is less than the calculated value of T. Thus, we accept the null hypothesis and conclude that there is no significant difference between the samples at 5% level of significance.
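The rank sums in Solution 07 can be recomputed with a short sketch. The function name `wilcoxon_rank_sums` and the inner helper are assumptions for illustration; tied absolute differences receive the average of the ranks they occupy, as in Table 9.8b.

```python
def wilcoxon_rank_sums(x, y):
    """Return (T_plus, T_minus) for paired samples, dropping zero differences."""
    diffs = [a - b for a, b in zip(x, y) if a != b]
    abs_sorted = sorted(abs(d) for d in diffs)

    def average_rank(value):
        # average of the 1-based positions that `value` occupies in the sorted list
        first = abs_sorted.index(value) + 1
        count = abs_sorted.count(value)
        return first + (count - 1) / 2

    t_plus = sum(average_rank(abs(d)) for d in diffs if d > 0)
    t_minus = sum(average_rank(abs(d)) for d in diffs if d < 0)
    return t_plus, t_minus


first_sample = [10, 21, 15, 14, 11, 16, 13, 16]
second_sample = [9, 25, 13, 15, 17, 16, 10, 11]
t_plus, t_minus = wilcoxon_rank_sums(first_sample, second_sample)
print(t_plus, t_minus, min(t_plus, t_minus))
```

The two sums always add up to n(n + 1)/2 for the n non-zero differences (here 7 × 8 / 2 = 28), which is a quick check on the ranking.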

9.9 Mann-Whitney U Test

The Mann-Whitney U test is used to find out whether two given samples are drawn from identical populations or not. Here, the null hypothesis states that the two samples are drawn from populations having the same distribution.

For example, suppose there are two samples of sizes $n_1$ and $n_2$, such that $N = n_1 + n_2$.

The observations of the two samples are pooled and ranked together. If two values are equal, then the average of the ranks that would have been assigned to them had they been different is assigned to both values. Then, the ranks belonging to each sample are summed separately and are denoted by $R_1$ and $R_2$, respectively.


For a small sample ($n_1$, $n_2$ not larger than 8), the test statistics are given as:

$$U_1 = n_1 n_2 + \frac{n_1 (n_1 + 1)}{2} - R_1$$

$$U_2 = n_1 n_2 + \frac{n_2 (n_2 + 1)}{2} - R_2$$

Then, obtain $\min(U_1, U_2)$ and compare the obtained value with the tabulated value of U for the sample sizes ($n_1$, $n_2$). If the calculated value is greater than the tabulated value, the null hypothesis is accepted; otherwise, it is rejected.

For a larger sample ($n_1$, $n_2$ between 9 and 20), the test statistic is given as:

$$U = n_1 n_2 + \frac{n_1 (n_1 + 1)}{2} - R_1$$

Where,

$$U \sim N(\mu_U, \sigma_U^2)$$

with mean $\mu_U = \frac{n_1 n_2}{2}$ and variance $\sigma_U^2 = \frac{n_1 n_2 (n_1 + n_2 + 1)}{12}$.

Therefore, the test statistic is:

$$Z_U = \frac{U - \mu_U}{\sigma_U}$$

If the alternative is that the first population > the second population, then reject $H_0$ for $Z_U < -Z_\alpha$. If the alternative is that the first population < the second population, then reject $H_0$ for $Z_U > Z_\alpha$. If the alternative is that the first population and the second population differ from each other, then reject $H_0$ if $Z_U < -Z_{\alpha/2}$ or $Z_U > Z_{\alpha/2}$.
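The computation of the U statistics can be sketched as follows. The function names and the two small samples are hypothetical, and the sketch assumes the usual form of the test in which ranks are assigned over the pooled observations, with tied values sharing the average rank.

```python
from math import sqrt


def mann_whitney_u(sample1, sample2):
    """Return (U1, U2) using ranks assigned over the pooled observations."""
    pooled = sorted(sample1 + sample2)

    def average_rank(value):
        # average of the 1-based positions that `value` occupies in the pooled list
        first = pooled.index(value) + 1
        return first + (pooled.count(value) - 1) / 2

    n1, n2 = len(sample1), len(sample2)
    r1 = sum(average_rank(v) for v in sample1)   # rank sum of the first sample
    r2 = sum(average_rank(v) for v in sample2)   # rank sum of the second sample
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
    return u1, u2


def z_statistic(u, n1, n2):
    """Normal approximation for larger samples."""
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 * (n1 + n2 + 1) / 12
    return (u - mean_u) / sqrt(var_u)


# Hypothetical samples (not taken from the text); the value 15 is tied.
sample_a = [12, 15, 18, 20]
sample_b = [10, 11, 15, 16, 22]
u1, u2 = mann_whitney_u(sample_a, sample_b)
print(u1, u2, min(u1, u2))
```

A useful check is that U1 + U2 always equals n1 × n2. For samples this small the decision is made from the table of U; `z_statistic` is shown only to make the large-sample formula concrete.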


9.10 Chapter Summary

Non-parametric tests are used for inferences that do not require any assumptions about the population in the study.

The chi-square ($\chi^2$) test is used for the following cases:

o To test the significance of population variance or homogeneity

o To test the goodness of fit

o To test the significance of association between two attributes

The sign test is a non-parametric test in which only the direction, denoted by a '+' or '-' sign, is considered, while the magnitude is not. There are two types of sign test:

o One sample sign test

o Two sample sign test

A run test is used to find out whether the sample is random or not. This test is based on runs, that is, sequences of identical symbols that are followed or preceded by a different symbol or no symbol.

Spearman's correlation coefficient is used to determine whether two variables are independent or not.

Kendall's coefficient is the technique to verify if there is any association between more than two variables.

The Wilcoxon matched-pairs test is used in the case of paired data.

10. Research Report Writing

10.1 Introduction

In the previous chapters, the topics necessary for research were discussed. In this chapter, you will get a brief idea of how to write up the research in order to produce a proper research report. Right from data collection to analysis and interpretation of the data, the whole process of research is systematically represented in a research report.

Research report writing is the final stage of research. Writing a good, effective, and detailed research report is very subjective in the sense that it varies according to the perspective, experience, and research of different researchers.

Research is now widely carried out in different fields of the physical and social sciences, and proper reports should be written to convey its results. A research report contains all the evidence and valid references to support its interpretation. Hence, research report writing is the most important part of any research.

After reading this chapter, you will be able to:

Define research report

State the importance of research report

List the steps in writing research report

State the different parts of a research report

Describe the preliminary part of a research report

Describe the main body of a research report

Discuss the different types of research report


10.2 Meaning and Importance of Research Report

Meaning

According to Zikmund, "A research report is an oral presentation and/or written statement whose purpose is to communicate research results, strategic recommendations, and/or other conclusions to management or other specific audiences."

Thus, a research report is an essential part of research, without which the research is incomplete. It not only presents the research findings but also summarizes the whole process of research. A report for a business firm is prepared to help in business decision making, as well as to forecast the measures to be taken which will lead to the growth of the business.

A research proposal reflects a systematic order of the steps of a research work. It is defined by Ranjit Kumar as, "an overall plan, scheme, structure and strategy designed to obtain answers to the research question or problems that constitute your research project, its main function is to detail the operational plan for obtaining answers to your research questions."

In fields like finance, marketing, sales, human resources, mathematics, statistics, medicine, biotechnology, social sciences, etc., research is carried out extensively, but the presentation of research reports differs according to the subject. However, to standardize these reports across different fields, a set of guidelines and a format is followed to obtain consistency in the reports. In this chapter, the standard format for a research report, that is, the main sections to be included in a report, is discussed.

For example, a Ph.D. thesis is a research report in which the whole process is written up to be communicated to the readers after the approval of the guide. Its emphasis is more on unique findings in the respective field.


Importance

The importance of a research report is as follows:

It serves as the medium for communicating the research work.

It aids management decision making.

It also helps in planning schemes and strategies for the future, based on its results.

It can be used for future reference in any related research, for a more advanced study of the findings.

Characteristics of a Good Report

A good report is one which is clear, easily understood, and precise. To distinguish a

report as a good one, the following characteristics are to be satisfied:

It should be properly presented.

It should be attractive in its appearance.

The chapters and sections should be well organized.

It should be in a simple language so that it is easily understood by its

readers/audience.

The facts mentioned in the report should be scientifically verified. Also, the data should be checked, so that there are no issues of validity and reliability.

Data should be collected in a practical way.

It should highlight the difficulties faced in data collection and not only the

achievements in its success.

A report having the above characteristics is attractive in its approach and draws more readers.


10.3 Steps in Writing a Research Report

Step 1. Logical Analysis of Research Question under Study

Step 2. Preparation of the Outline

Step 3. Preparation of the Rough Draft

Step 4. Rewriting and Refining the Rough Draft

Step 5. Preparation of the Bibliography

Step 6. Final Draft Writing

Fig. 10.3a: Steps to Writing a Research Report

1. Logical Analysis of Research Question under Study

The first step in writing a research report is the analysis of the research question, where two aspects are to be considered:

Logical: Analyze all logical associations and relations between the research question under study and other studies, published papers, etc.

Chronological: Analyze all chronological evidence relating to the research question under consideration.

2. Preparation of the Outline

The next step is to prepare an outline of the research work. By doing this, the

research can be framed in a systematic order and also, one can list out all the

important points to be considered during the research.


3. Preparation of the Rough Draft

After the outline of the research is prepared, the next step is to prepare a rough draft. The rough draft will consist of the procedure for data collection and the limitations faced in the collection, statistical tools for analysis, generalization of the results, etc.

4. Rewriting and Refining the Rough Draft

In this step, all the limitations present in the rough draft of the research report are rectified and a more refined draft is prepared.

5. Preparation of the Bibliography

Before the final step, you need to prepare the bibliography, which is the list of books, journals, articles, etc., set out in a particular manner. Books and pamphlets are listed in the order: name of the author (last name first), title, place, publisher, date of publication, and volume number. Magazines and newspapers are listed in the order: name of the author (last name first), title of article (in quotation marks), name of periodical, volume number, date of issue, and page number.
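The two field orders can be captured in a small formatting sketch. The function names, the comma separator, and the "Vol." and "p." prefixes are assumptions made for illustration; the text fixes only the order of the fields. The book entry uses the Kothari reference cited in this eBook, with the author's initials supplied here.

```python
def format_book(author_last, author_first, title, place, publisher, date, volume=None):
    """Order: author (last name first), title, place, publisher, date, volume."""
    fields = [f"{author_last}, {author_first}", title, place, publisher, str(date)]
    if volume is not None:
        fields.append(f"Vol. {volume}")   # "Vol." prefix is an assumed convention
    return ", ".join(fields) + "."


def format_article(author_last, author_first, article_title, periodical,
                   volume, issue_date, page):
    """Order: author, article title in quotation marks, periodical, volume, date, page."""
    fields = [f"{author_last}, {author_first}", f'"{article_title}"', periodical,
              f"Vol. {volume}", str(issue_date), f"p. {page}"]
    return ", ".join(fields) + "."


book = format_book("Kothari", "C. R.", "Research Methodology",
                   "New Delhi", "New Age International", 2002)
# Hypothetical article entry, for illustration only.
article = format_article("Doe", "Jane", "Sample Article", "Sample Periodical",
                         3, "May 2010", 12)
print(book)
print(article)
```

Keeping each entry type in its own function makes it easy to adapt the separators to whatever style guide a given institution prescribes.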

6. Final Draft Writing

Lastly, the final draft of the research is prepared, where detailed information about the research is given in simple language. It is a refined version of all the previous drafts of the research, in order to get a polished and proper research report.


10.4 Report Format

For an effective and valid research report, it is necessary to write the report in a standard format that is universally accepted. A report can be divided into three parts: preliminary parts, main body, and appended parts. These three parts can be further classified, as shown in Fig. 10.4a.

Research Report Parts

Preliminary Parts: Title Page; Letter of Authorization; Table of Content; List of Figures, Tables, and Graphs; Executive Summary (Objectives, Results, Conclusions, Recommendations)

Main Body: Introduction; Methodology; Data Analysis and Results; Conclusions and Recommendations

Appended Parts: Data Collection Forms; Detailed Calculations; General or Technical Tables; Bibliography

Fig. 10.4a: Research Report Parts

A report should contain all the parts shown in Fig. 10.4a, in the same order, so as to be accepted by its readers. Every report, irrespective of the subject, is advised to use this standard format.

10.5 Preliminary Parts of Research Report

The preliminary parts of a research report consist of the following:

Title page

Letter of authorization

Table of content

List of figures, tables, and graphs

Title Page

The title page should express the title of the research, 'for whom' the report is prepared, 'by whom' it is prepared with the 'name of the institute/university/company', and the 'date of its release'. The title of the research report should correctly and completely depict the purpose of the research.

Letter of Authorization

For the validity and approval of the research, the letter of authorization from the

concerned authority is required. The letter approves the work done for the research,

highlighting the details of the data and its sources. Also, along with this letter of

authorization, a letter of transmittal is given, which indicates the release of the report to

its readers.

EMR Research Group

August 30, 2009

Mr. Mario Lagasto
President, Leading Edge Food Group
Columbia, IA 50057

Re: Presentation of Research Identifying Customer Loyalty

Dear Mr. Lagasto:

The report outlined in the research proposal of March 15, 2009, is complete. I have personally supervised the project, conducted the statistical analyses, and prepared this report along with my two senior research associates, Natalia James and David Parker.

The report addresses the key decision statement: In what ways can your restaurants build customer loyalty so that revenues increase through more frequent patronage? The key research questions involve identifying controllable characteristics that end up relating to greater share of wallet. As agreed upon in the proposal, the report offers no specific recommendations for managerial action, but rather, it presents conclusions which should enable you to make informed decisions. Thus, the conclusions conform to the deliverables described in the proposal letter.

We successfully accomplished the research project as described in the outline. We were able to meet our goals for interviewing groups of customers and non-customers in a timely fashion. We are grateful for your business and look forward to working with you as you develop strategic plans of action based on this report. Once you have taken a look at the report, please contact me and we will schedule a formal presentation and question and answer period for your management team.

Sincerely,

Barry J. Babin
President

Fig. 10.5a: Letter of Authorization

Source: Business Research Methods, 9th Edition, Zikmund, Babin, Carr, Griffin


Thus, the letter of authorization is a declaration given by the person who has verified the whole study and declared the acceptability of the study, as in Fig. 10.5a.

Table of Content

It is an essential part of any report. It is the list of all the topics covered, with the topic divisions and subdivisions, along with their page references. The table of content is prepared on the basis of the final draft of the research.

Table of Contents

CHAPTER 1 RESEARCH FUNDAMENTALS AND TERMINOLOGY  14
DEFINITIONS OF BASIC RESEARCH TERMS  15
    Research  15
    Research Methods  15
    Research Methodology  17
    Scientific Methods  17
    Research Process  19
    Research Design  19
OBJECTIVES OF RESEARCH  20
MOTIVATION IN RESEARCH  21
SIGNIFICANCE OF RESEARCH IN GOVERNMENT, INDUSTRY, BUSINESS AND TRADE  23
SCOPE OF RESEARCH INCLUDES THE FOLLOWING AREAS  28
PRINCIPLES OF QUALITY RESEARCH WORK  29
PROBLEMS/LIMITATIONS OF RESEARCH  31
ISSUES AND TRENDS IN RESEARCH  33
SUMMARY  35
REVIEW EXERCISES  35
FURTHER READINGS  37

CHAPTER 2 IMPORTANCE OF RESEARCH IN MANAGEMENT DECISIONS  38
FUNDAMENTALS OF MANAGEMENT DECISIONS  39
    Characteristics of Management Decisions  39
    Elements of Decision Making  40
TYPES OF MANAGEMENT DECISIONS  44
    Planning Decisions on Time Horizons  45
    Static Planning Decisions  47
    Dynamic Planning Decisions  47
    Planning under Dynamic Conditions  48
    Planning Intangible Decisions  50
    Control Decisions  51
    Programmed and Non-programmed Decisions  52
    Routine and Strategic Decisions  51
    Policy and Strategic Decisions  51
    Departmental and Non-Economic Decisions  51

Fig. 10.5b: Table of Content

Source: http://ebookbrowsee.net/research-methodology-self-learning-manual-pdf-d184166142

List of Figures, Tables, and Graphs

If there are many figures, graphs, and tables in support of the research, a list of the names of the figures, graphs, and tables with page references is given after the table of content in a research report.


These sections are included in the preliminary part of a research report. In addition to these sections, the preliminary part also contains an executive summary. After this, the main body of the research report begins, explaining the whole procedure and technique of the research.

The Executive Summary

It is simply the summary of the whole report. It briefly explains all the four parts of a

research:

Objectives: It states all the important information and purpose of the research.

Results: It states the methodology and result of the research.

Conclusions: It states the interpretation of the results obtained and other

interpretations, based on the results.

Recommendations: It states any suggestions, based on the c o n c l u s i o n .

A sample executive summary has been given below:

Executive Summary

Uncertainty associated with changes in carbon stock is from two additive variance

components:

o Prediction error, a measure of possible bias in the allometric equations used

to predict above ground tree carbon.

o Sampling error, a measure of variation recognizing that only a very small

proportion of Kyoto forest is actually surveyed.

The average prediction error is estimated to be around 1%. This figure is likely to be an underestimate, especially when estimating changes in carbon stock over 2008-2013. More biomass data are required to verify this uncertainty. While we show the effects of varying prediction error on total uncertainty, the confidence intervals in this report are calculated only in terms of sampling error.

The estimates of carbon stock from the 2004 Nelson and Marlborough pilot data are 64.4 ± 12.6 t/ha (95% confidence interval). This estimate uses analytical methods to calculate the uncertainty. Carbon stock is estimated from 104 plots for six pools:

o Above-ground live planted trees

o Above-ground live other species (includes unplanted trees and shrubs)

o Below-ground live planted trees

o Below-ground live other species (includes unplanted trees only)

o Coarse woody debris

o Fine litter


Estimates of change in carbon stock using C_Change to predict carbon for 2008 and 2013 are 55.0 ± 10.3 t/ha. The estimate of change in carbon is for four pools:

o Above-ground live planted trees

o Below-ground live planted trees

o Coarse woody debris

o Fine litter

Uncertainty is expected to be reduced in a nationwide survey; with 200 sites the confidence interval is estimated to be 4.9 t/ha. This estimate assumes that the two surveys, 2008 and 2013, are correlated with ρ = 0.90. There is some evidence that the correlation could be as high as 0.97 but may be less than this if there is extra variation from genetic, silviculture and climatic factors. A conservative approach should be adopted in choosing the final number of sites in the nationwide survey to allow for future extra variation.

Estimates of uncertainty have been derived using analytical methods and it is not

necessary to use Monte Carlo simulation.

Extra error from area definition will inevitably increase uncertainty associated with total carbon stocks. The estimated errors apply to carbon density (t/ha).

Source: http://www.math.canterbury.ac.nz/research/ucdms2005n8.pdf
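
The sample summary rests on two pieces of standard arithmetic: independent error components add as variances, and the uncertainty of a change between two surveys shrinks as the correlation between the surveys rises. The following is a minimal sketch of both ideas, not the method used in the cited report. The function names and the standard errors of 6.0 t/ha are hypothetical; only the correlations 0.90 and 0.97 come from the sample, and a normal approximation (95% half-width = 1.96 standard errors) is assumed.

```python
import math


def combined_standard_error(prediction_se, sampling_se):
    """Combine two independent error components whose variances add."""
    return math.sqrt(prediction_se ** 2 + sampling_se ** 2)


def change_standard_error(se_first, se_second, rho):
    """Standard error of (second - first) for two surveys with correlation rho."""
    variance = se_first ** 2 + se_second ** 2 - 2.0 * rho * se_first * se_second
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(variance, 0.0))


def half_width_95(standard_error):
    """Half-width of a 95% confidence interval under a normal approximation."""
    return 1.96 * standard_error


# Hypothetical standard errors (t/ha) for the two survey years.
se_2008 = 6.0
se_2013 = 6.0

for rho in (0.0, 0.90, 0.97):
    se_change = change_standard_error(se_2008, se_2013, rho)
    print(f"rho = {rho:.2f}: 95% half-width of the change = "
          f"{half_width_95(se_change):.2f} t/ha")
```

Running the loop shows the half-width narrowing as the assumed correlation increases, which is why the choice of correlation matters when the number of sites is decided.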

10.6 Main Body of Research Report

The main body of a research report has the following sections:

Introduction

Methodology

Data analysis and results

Conclusions and recommendations

Appended part

10.6.1 Introduction

The first section of the main body is the introduction, which introduces the research to its readers. This section clearly states the objectives of the research and the reasons for taking up the investigation. The main concept involved in the research is introduced and explained, so that all the terms used throughout the report are covered in this section.


Thus, the introduction helps readers comprehend the purpose and concept of the research. The introduction always follows the executive summary. A sample 'Introduction' is given below:

Introduction

As a signatory to the Kyoto Protocol New Zealand has agreed to report, in a transparent

and verifiable manner, greenhouse gas emissions by sources, and removals by sinks,

associated with direct human-induced, land-use change and forestry activities. These

land-use change and forestry activities are limited to afforestation, reforestation, and

deforestation that have occurred since 1990. In order to provide the necessary data to

allow carbon stocks, and changes in carbon stock, to be estimated in accordance with

the recently-adopted Good Practice Guidance for Land-Use, Land-Use Change and

Forestry (IPCC 2003), a national forest inventory specifically designed for carbon

monitoring is being implemented. The initial focus of the inventory will be planted Kyoto

compliant forests. These are forests which were established after 1 January 1990 on

land, which did not previously contain forests. Part of the preliminary work associated

with the development of this national inventory consisted of a pilot survey, which was

conducted in the Nelson and Marlborough regions. The purpose of the pilot study was to

test the proposed field methodology and collect sufficient data to be able to produce

initial estimates of carbon stocks and stock changes.

Any large-scale survey will include some errors (Merritt et al. 2005). Good practice in

forest inventories means that uncertainty associated with the survey and estimation

should be reduced as far as practicable. Good practice also recognizes that while there

will be some uncertainty remaining it should be identified. Uncertainty analysis is

concerned with this identification of credible limits to the accuracy of an estimate

(Cullen and Frey 1999). Moreover, the good practice guide (IPCC 2003) for the

preparation of greenhouse gas inventories stipulates that uncertainties associated with

estimates of sources and removals must be quantified.

In this report, we present estimates of the uncertainty associated with the carbon

estimates from the pilot study to demonstrate procedures for future analysis, when

carbon is assessed at a nationwide scale.

Source: http://www.math.canterbury.ac.nz/research/ucdms2005n8.pdf


As the samples above show, the executive summary is followed by the introduction of the report. In the executive summary, bullet points are used to separate the different sections. The executive summary can thus be regarded as the abstract of the report, whereas the introduction is more elaborate.

After the research objective, concept, and purpose are explained, a review of literature is provided. It helps the reader place the research in the context of other similar studies. Each study mentioned should be identified by its author and year.

Methodology

Data for research should be collected in a scientific manner in order to obtain valid results. This section explains the methods and techniques used to collect the data. The methods are selected so as to minimize the biases that can creep in during the different phases of data collection.

The methodology provides the following:

Research Design: This includes the study type, the source of data (primary or secondary), details of how the data are collected, and the medium used in the research.

Sample Design: It explains the type of sampling design and the sample size used for data collection. An appropriate sampling type is chosen depending upon the population from which the data are to be collected.

The Fieldwork for Data Collection: This covers the whole process of field data collection, including by whom, how, and where the data will be collected.
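
As one illustration of what a sample design might specify, the sketch below draws a simple random sample without replacement from a sampling frame. It is a minimal sketch under stated assumptions: the frame of 500 numbered respondents, the sample size of 50, and the function name draw_simple_random_sample are all hypothetical, and simple random sampling is only one of the sampling types a study might use.

```python
import random


def draw_simple_random_sample(frame, sample_size, seed=None):
    """Return sample_size distinct units drawn at random from the frame."""
    if sample_size > len(frame):
        raise ValueError("sample size cannot exceed the size of the sampling frame")
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(frame, sample_size)


# Hypothetical sampling frame: respondent IDs 1 to 500.
sampling_frame = list(range(1, 501))
sample = draw_simple_random_sample(sampling_frame, sample_size=50, seed=42)

print(len(sample), "units selected out of", len(sampling_frame))
```

Recording the seed alongside the sample size in the methodology section lets a reader reproduce exactly which units were selected.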

Data Analysis and Results

After the methodology is clearly stated, the next section describes the types of analysis carried out to obtain the results of the study. This section is the most important part of the research. The data analysis employed is explained briefly, along with the reason it is suitable for the research. If the analysis is not appropriate for the population under consideration, the results of the research may deviate from the actual findings.


Research without a result or finding is incomplete and unacceptable. It is therefore necessary to carry out a proper analysis and arrive at a result. The interpretation of the result will indicate whether the hypothesis under consideration is supported or rejected.

Conclusions and Recommendations

This section presents the researcher's judgment based on the results obtained, together with suggestions arising from them. That is, it summarizes the researcher's view of the study and communicates it to the readers.

Appended Part

It consists of all the material or subsidiary documents related to the research. Technical documents can also be added in the appended part. The documents and references usually included in this part are:

Data Collection Forms

Detailed Calculations

General and Technical Tables

Bibliography

Data Collection Forms

All the questionnaires, schedules, or proformas used are placed in this section.

Detailed Calculations

The calculations behind a result need to be shown, but the brevity of the report often leaves no room to discuss them; such calculations can be provided in the appendix. Terminology mentioned in the report but not explained there should also be discussed in the appended part.

General and Technical Tables

The statistical or measurement tables used to interpret the findings must be furnished in the appendix.

Bibliography

It consists of all the references used in the research report. In some reports the bibliography is titled 'References'. Fig. 10.6a shows the bibliography of a report, in which all the books, articles, and links used for reference are listed.


BIBLIOGRAPHY

"Bureau of Indian Affairs". 12 July 2008. Department of Indian Affairs. 2002 <http://www.doi.gov/bia/>.

"Bureau of Indian Affairs: Quick Facts". 12 July 2008. Department of Indian Affairs. 2002 <http://www.doi.gov/bia/quick_facts.html>.

Conley, R. J. The Cherokee Nation: A History. Albuquerque: University of New Mexico Press, 2005.

"Education Facts and History". 18 July 2008. National Indian Education Association. 2002 <http://www.niea.org/history/research.php>.

Ethridge, R. Creek Country: The Creek Indians and Their World. Chapel Hill: The University of North Carolina Press, 2002.

Fenn, E. A., Wood, P. H., Watson, H. L., Clayton, T. H., Nathans, S., Parramore, T. C., et al. The Way We Lived in North Carolina. Chapel Hill, NC: The University of North Carolina Press, 2003.

Gannon, M. FLORIDA: A Short History. Gainesville: University Press of Florida, 2003.

"Native Americans - American Indians - The First People of America". 12 July 2008. Native Americans Website. 2007 <http://www.nativeamericans.com/>.

Spencer, D. D. Seminole Indians in Old Picture Postcards. Ormond Beach: Camelot Publishing Company, 2002.

Taylor, R. A. FLORIDA: An Illustrated History. New York: Hippocrene Books, Inc., 2005.

Fig. 10.6a: Bibliography

Source: http://img.docstoccdn.com/thumb/orig/123795011.png

10.7 Types of Research Report

A research report can take different forms, depending on its target audience. In terms of how the results and procedure of the research are presented, a report can be of the following types:

Technical report

Popular report

Article

Monograph

Oral presentation


In all these types of report, the main purpose is to describe the research completely; they differ only in writing style, that is, in the way the procedure of the research is presented. For example, a study meant for the general population is presented in a simple but concise way, whereas a report for an audience familiar with the technical terminology of the subject under study can be more precise and technical.

The types of reports mentioned above are discussed below:

Technical Report

This kind of report concentrates mainly on the research methodology used, the assumptions required for the study, and the limitations of the study, with supporting evidence or data.

The main sections in a technical report are:

1. Summary of the study: A brief overview of the research and its findings.

2. Nature of the study: It includes the objectives of the research, the research formulation, and the hypothesis. It also includes details about the population targeted for the study, which supplies the required data.

3. Methodology: It includes the methods and techniques used for the collection of data, along with details of the sample size, the type of sampling, and the medium used for data collection.

4. Data: This section describes the data and their characteristics in detail.

5. Analysis of data and interpretation of results: The data collected are analyzed with suitable statistical tools. The interpretation of the resulting output answers the research question formulated and completes the research.

6. Conclusion: A detailed summary of the results and the suggestions drawn from them are included in this section.

7. Bibliography: The sources consulted in the study are listed in this section.

8. Technical appendices: This section includes all the documents related to the research, such as the questionnaires or schedules and the tables or techniques used in analyzing the data.


Popular Report

It is rightly named 'popular' because such a report is prepared in a generalized manner for a mass audience. As it is read by people from different backgrounds, it must be written in very simple, understandable language. It contains more flowcharts, graphs, and figures, so as to explain the research simply but in detail.

The following sections are to be included while preparing a popular report:

1. Summary: This section throws light on the general findings of the research and their application in practice.

2. Recommendations for the research: The recommendations based on the results are included in this section.

3. Objective of the study: Specific objectives for the research are given in this section.

4. Methodology: The techniques and methods used for the research are included in this section. The details are given in a way that is easily understood, avoiding technical terms that are not in everyday use.

Article

An article is a short write-up that is published in a newspaper, magazine, or journal. It, too, is meant for a mass audience. It is short, attractive, and less formal in its writing style.

It mainly includes the following sections:

1. Title: An attractive title to gain the attention of its readers.

2. Introduction: A clear review of the research is included in a short and simple way.

3. Main body: It comprises two to five paragraphs describing the details of how the

research is done.

4. Conclusion: It gives the final interpretation of results or comments regarding the

study.


Monograph

Compared with the reports mentioned above, a monograph is the most detailed write-up, written technically on a specific subject. The main objective of such a report is to provide insight into the topic under study and to be more informative. The writer must make sure that the topic has not already been established in earlier studies, that is, it should be unique. It can, however, build on the results of a previous study.

The target readership of a monograph is very limited because it is subject-specific rather than general.

Oral Presentation

An oral presentation is also an essential way of presenting a report, used to explain a study to its clients or readers. In such presentations, the researcher can highlight the importance, objectives, and results of the study precisely by using more graphs, tables, and flowcharts. Since the presentation is face to face, the audience can clear their doubts by asking the researcher relevant questions. Its advantage is the active interaction between the researcher and the audience.
