INTRODUCTION
Q.1: Explain the concept of data processing and analyse in detail the
different stages in data processing.
Data Processing
Data reduction or processing mainly involves the various manipulations necessary for preparing the
data for analysis. The process (of manipulation) could be manual or electronic. It involves
editing, categorising open-ended questions, coding, computerisation and the preparation of
tables and diagrams.

Some answers are internally inconsistent: for example, a family which educates its children in a
costly private school cannot survive on a monthly expense of Rs. 2500. Such answers need editing.
[Flowchart: stages of data processing — editing, coding, computer feeding, data distribution,
tabulation, frequency distribution (univariate, bivariate, multivariate), categorisation,
measurement, data analysis, diagrammatic representation and data interpretation.]
Editing is required for proper coding and entering the data in the computer (when the decision is
taken not to analyse the data manually). Editing thus ensures that the data are complete,
error-free, readable and worthy of being assigned a code. The editing process begins in the field
itself. Interviewers, soon after completing the interviews, should check the completed forms for
errors and omissions. They can complete the incomplete responses and reduce the number of
non-responses with rapid follow-up, stimulated by field editing. In many cases, field editing may
not be possible. In such cases, in-house editing may help.
Editing also occurs simultaneously with forming categories, e.g. the age given by respondents
may be put in the categories of below 18 years (very young), 18-30 years (young), 30-40 years
(early middle-aged), 40-50 years (late middle-aged) and above 50 years (old). Field
supervisors can do editing in the field itself by re-contacting the respondents. Editing can be
done along with coding too.
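The age categories above can be sketched as a simple coding function. The half-open boundary treatment is an assumption, since the ranges in the text overlap at 30, 40 and 50:

```python
def categorise_age(age):
    """Assign an age (in years) to one of the five categories in the text.

    The source ranges overlap at the boundaries (e.g. 30 appears in both
    "18-30" and "30-40"), so half-open intervals [lower, upper) are assumed.
    """
    if age < 18:
        return "very young"
    elif age < 30:
        return "young"
    elif age < 40:
        return "early middle-aged"
    elif age < 50:
        return "late middle-aged"
    return "old"

print(categorise_age(25))  # young
print(categorise_age(45))  # late middle-aged
```

Editing at this stage simply means running every recorded age through one consistent rule, so that no respondent can end up in two categories.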
Editing also requires re-arranging answers to open-ended questions. Sometimes a "don't know"
answer is edited to "no response". This is wrong. "Don't know" means the respondent is not sure
and is in two minds about his reaction, or is not able to formulate a clear-cut opinion, or
considers the question personal and does not want to answer it. "No response" means that the
respondent is not familiar with the situation/object/individual about which he is asked.
Coding of Data
Coding is translating answers into numerical values, or assigning numbers to
the various categories of a variable to be used in data analysis. Coding is
generally done while preparing the questions and before finalising the
questionnaires and interview schedules. Fieldwork is thus done with
precoded questions. However, sometimes, when questions are not precoded,
coding is done after the fieldwork. Coding is done on the basis of the
instructions given in the codebook. The codebook gives a numerical code for
each variable.
Coding is done by using a codebook, a code sheet and a computer card. The
codebook explains how to assign numerical codes to the response categories
received in the questionnaire/schedule. It also indicates the location of a
variable on the computer cards. A code sheet is a sheet used to transfer data
from the original source (i.e. questionnaire/schedule, etc.) to the cards.
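A codebook can be sketched as a mapping from response categories to numeric codes. The variable names, categories and codes below are illustrative assumptions, not taken from any actual codebook:

```python
# Illustrative codebook: each variable maps verbal response categories
# to the numeric codes used in analysis.
CODEBOOK = {
    "gender": {"male": 1, "female": 2},
    "opinion": {"agree": 1, "disagree": 2, "don't know": 8, "no response": 9},
}

def code_response(variable, answer):
    """Translate a verbal answer into its numeric code via the codebook."""
    return CODEBOOK[variable][answer.strip().lower()]

coded = [code_response("opinion", a) for a in ["Agree", "don't know", "Disagree"]]
print(coded)  # [1, 8, 2]
```

Note that "don't know" (8) and "no response" (9) get distinct codes, preserving the distinction the editing discussion above insists on.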
Tabulation of Data
After editing, which ensures that the information on the schedule is accurate
and categorised in a suitable form, the data are put together in some kind of
tables and may also undergo some other forms of statistical analysis.

Line Graphs
In a line graph, values of the independent variable are plotted on the x-axis
and those of the dependent variable on the y-axis. Graph 1 shows the
number of cognizable crimes in India in the last 40 years.
Sometimes a multiple line graph is also used for indicating comparisons
between two or more elements, as shown in Graph 2.
Histograms
In a histogram, the values of variables are presented in vertical bars drawn
adjacent to each other, as shown in Diagram 3. The difference between a
graph and a histogram is that in a graph, individual points are plotted,
whereas in a histogram adjacent bars represent the values.
Stages in Analysis
The analysis of a research study is done in four stages. These are (i) categorization,
(ii) frequency distribution, (iii) measurement, and (iv) interpretation.
Categorization
Measurement
In an interval scale, the distances between any two adjacent categories are
equal. The ratio scale is used for determining ratios of the numbers assigned
to categories.
Interpretation
Interpretation of data can be descriptive or analytical, or it can be from a
theoretical standpoint. Negative results are much harder to interpret than
positive results (i.e., when the data support the hypotheses).
Exhibit
Annual contract salary for a nationwide sample of public school teachers for school year 1975-76

Annual          Frequency  Midpoint  Limit     Relative   Cumulative relative
salary                                         frequency  frequency
(1)             (2)        (3)       (4)       (5)        (6)
                                     20,999.5             1.001
19,000-20,999   43         19,999.5            0.033
                                     18,999.5             0.968
17,000-18,999   98         17,999.5            0.075
                                     16,999.5             0.893
15,000-16,999   125        15,999.5            0.095
                                     14,999.5             0.798
13,000-14,999   179        13,999.5            0.137
                                     12,999.5             0.661
11,000-12,999   275        11,999.5            0.210
                                     10,999.5             0.451
9,000-10,999    363        9,999.5             0.277
                                     8,999.5              0.174
7,000-8,999     224        7,999.5             0.171
                                     6,999.5              0.003
5,000-6,999     4          5,999.5             0.003
                                     4,999.5
TOTAL           1,311                          1.001
Column 2 indicates the frequency, or number, of teachers with contract salaries in each of the
eight intervals. For example, 98 teachers earned salaries between $17,000 and $18,999. Columns 1
and 2 taken together indicate the frequency with which individual teacher salaries fall within
each of the intervals and are referred to as a frequency distribution.
Because these data have been grouped, it is impossible to identify where within an interval each
of the 98 teachers is located. Two assumptions are usually made in dealing with grouped data.

The units within an interval are assumed to have values of the variable that average out to the
midpoint of the interval. Midpoints for the intervals are obtained by averaging the lower and
upper values of the interval. (For example, the midpoint of the interval containing 275 teachers'
salaries is the average of $11,000 and $12,999 (Row 5), or $11,999.5.) Successive midpoints
appear in column 3.
Limits dividing the successive intervals are halfway between the upper value in one interval and
the lower value in the next higher interval. For example, the limit dividing the intervals with
179 and 125 teachers is halfway between $14,999 (Row 4) and $15,000 (Row 3), clearly at
$14,999.50. Successive interval limits are given in column 4.
Though not usually presented in a report, the entries in column 6 are necessary to develop one of
the graphic methods presented in the next section. Column 6, most frequently referred to as the
cumulative relative frequency distribution, is obtained by cumulating successively the entries
in column 5, starting with the lowest value and moving toward the highest value of the variable
of interest. These successive cumulations are then placed beside the successive limits dividing
adjacent intervals. For example, the entry of 0.451 in the last column, opposite the entry of
$10,999.50, was obtained by adding together 0.003 + 0.171 + 0.277. This quantity, 0.451, is the
proportion of teachers in the sample with salaries less than $10,999.50; that is, 45.1% of the
teachers in the sample fall in the first three class intervals.
Columns 1, 2 and 5 are those most frequently encountered in a tabular display of data of this
type, namely, of a variable. Characteristics that are attributes lend themselves to slightly
different tables.
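The construction of columns 3, 5 and 6 can be checked with a short sketch. The lowest interval is read here as 5,000-6,999 with a frequency of 4, an inference from the midpoint ($5,999.5), the limits and the total of 1,311:

```python
# Recomputing columns 3, 5 and 6 of the Exhibit from columns 1 and 2.
intervals = [(5000, 6999), (7000, 8999), (9000, 10999), (11000, 12999),
             (13000, 14999), (15000, 16999), (17000, 18999), (19000, 20999)]
freqs = [4, 224, 363, 275, 179, 125, 98, 43]

n = sum(freqs)                                    # 1,311 teachers in total
midpoints = [(lo + hi) / 2 for lo, hi in intervals]   # column 3
rel = [f / n for f in freqs]                          # column 5

cum = []                                          # column 6: cumulate from the
total = 0.0                                       # lowest interval upward
for r in rel:
    total += r
    cum.append(total)

print(round(midpoints[3], 1))   # midpoint of 11,000-12,999
print(round(cum[2], 3))         # proportion earning less than $10,999.50
```

The third cumulation reproduces the 0.451 discussed above: 45.1% of the sample falls in the first three class intervals.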
Presenting results
Frequencies and numbers can often be presented more clearly in charts than in words. Consider,
for example, the figure below, which presents the age distribution of respondents.

[Figure: bar chart showing the percentage of respondents in each age group: under 29 years,
30-39 years, 40-49 years, 50-59 years and over 60 years.]
Part of the analysis is the calculation of simple measures like averages, measures of dispersion,
percentages, correlation, etc. Hence statistical analysis forms a part of survey analysis.

The problems raised by the analysis of data are directly related to the complexity of the
hypothesis. Problems of data analysis involve the entire range of questions raised in research
design, from secondary analysis to the designing and redesigning of substitutes for the
controlled experiment.
After collecting the data from a representative sample of the population, the next step is to
analyse them to test the research hypotheses. However, before analysing the data to test
hypotheses, some preliminary steps need to be completed. These help to ensure that the data
are reasonably good and of assured quality for further analysis. There are four such steps,
namely:
1)
2)
3)
4)
the way to this knowledge. Thus the task of analysis can hardly be said to be complete without
interpretation coming in to illuminate the results.

Analysis of data is the most skilled task of all the stages of research. It is a task calling for
the researcher's own judgment and skill, and should be done by the researcher himself. Proper
analysis requires familiarity with the background of the survey and with all its stages. The
analysis need not necessarily be statistical, as both quantitative and non-quantitative
(qualitative) methods can be used.
The steps in the analysis of data depend upon the type of study. In case there is a set of clearly
formulated hypotheses, then each hypothesis can be seen as a norm prescribing a certain action to
be taken vis-à-vis the data. The more specific the hypothesis, the more specific the action. In
such a study, analysis of data is almost completely a mechanical procedure. Part of the analysis
is working out statistical distributions, constructing diagrams and calculating simple measures
like averages, measures of dispersion, percentages, correlation, etc. Hence statistical analysis
forms a part of survey analysis. The analysis means verification of hypotheses.
Analysis of data is one of the most important aspects of research. Since it is a highly
skilled and technical job, it should be carried out by the researcher himself or under
his supervision. It demands a deep and intense knowledge, on the part of the researcher,
of the data to be analysed. The researcher should also possess judgment skill and the
ability to generalise, and should be familiar with the background, objectives and
hypotheses of the study.
Data, facts and figures are silent; they never speak for themselves, but they have
complexities. It is through systematic analysis that the important characteristics which
are hidden in the data are brought out and valid generalizations are drawn. Analysis
demands a thorough knowledge of one's data. Without deep knowledge, the analysis
is likely to be aimless. It is only by organizing, analyzing and interpreting the
research data that we come to know their important features, inter-relationships and
cause-effect relationships. The trends and sequences inherent in the phenomena are
elaborated by means of generalization.
According to P. V. Young, the function of systematic analysis is to build an
intellectual edifice in which properly sorted and sifted facts and figures are placed in
their appropriate settings, so that broader generalizations beyond the immediate contents
of the facts under study, and consistent relationships, can be drawn.

We should remember that the steps envisaged in the analysis of data will vary
depending on the type of study. A set of clearly formulated hypotheses to start the
study with presents a norm prescribing a certain action to be taken. The more specific
the hypothesis, the more specific the action; in such types of studies, the analysis of
data is almost completely a mechanical procedure.
The most difficult task in the analysis and interpretation of data is the establishment of
cause-and-effect relationships, especially in the case of social and personal problems. Research
problems do not necessarily have one factor or a set of factors; they arise due to a
complex variety of factors and sequences. Karl Pearson has observed: "No phenomenon or
stage in sequence has only one cause; all antecedent stages are successive causes. When we
scientifically state causes we are really describing the successive stages of a routine of
experience."
In fact, human behaviour cannot be reduced to or explained by cause-effect sequences alone, as we
face difficulties in detecting the factors and in establishing cause-and-effect relationships:
the nature of these factors differs from one individual to another, and cause and effect are
inter-dependent, i.e., one stimulates the other.
Types of Analysis
Analysis of survey or experimental data involves estimating the values of unknown parameters
of the population and testing of hypotheses for drawing inferences. Analysis may be categorised
as:
1. Descriptive Analysis:
It is largely a study of the distributions of one or more variables. Such a study provides
profiles of a business group, work group, persons or other subjects on any of a multitude of
characteristics such as size, composition, efficiency, preferences, etc. Various measures that
show the size and shape of a distribution, along with the study of the relationship between two
or more variables, are available from this analysis.
2. Inferential Analysis:
It is concerned with the various tests of significance for testing hypotheses, in order to
determine with what validity the data can indicate some conclusion or conclusions. It is also
concerned with the estimation of population values. It is mainly on the basis of inferential
analysis that the task of interpretation is performed.
3. Correlation Analysis:
It studies the joint variation of two or more variables, for determining the amount of
correlation between them.
4. Causal Analysis:
It is concerned with the study of how one or more variables affect changes in another variable.
It is a study of the functional relationship existing between two or more variables.
5. Multivariate Analysis:
With the availability of computer facilities, multivariate analysis has developed; it means the
use of statistical methods which analyse more than two variables on a sample of observations.
These include:
a) Multiple Discriminant Analysis:
It is suitable when the researcher has a single dependent variable that cannot be measured, but
can be classified into two or more groups on the basis of some attribute. The objective of this
analysis is to predict an organization's probability of belonging to a particular group based on
several predictor variables.
b) Multiple Regression Analysis:
It is suitable when the researcher has one dependent variable which is presumed to be a function
of two or more independent variables. The objective of this analysis is to make a prediction about
the dependent variable based on its covariance with all the concerned independent variables.
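A minimal multiple-regression sketch, using fabricated data: y is constructed as an exact function of two independent variables, y = 1 + 2·x1 + 3·x2, so the least-squares fit should recover those coefficients:

```python
import numpy as np

# Fabricated data for two independent variables.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 5.0])
y = 1 + 2 * x1 + 3 * x2          # dependent variable, an exact function of x1, x2

# Design matrix with an intercept column, then ordinary least squares.
X = np.column_stack([np.ones_like(x1), x1, x2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

print(np.round(coef, 6))         # intercept and the two slope coefficients
```

With real survey data the fit would not be exact; the estimated coefficients would then be used to predict the dependent variable for new combinations of the independent variables.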
c) Multivariate Analysis of Variance (MANOVA):
This analysis is an extension of two-way ANOVA, wherein the ratio of among-group variance to
within-group variance is worked out on a set of variables.
d) Canonical Analysis:
This analysis can be used in case of both measurable and non-measurable variables for the
purpose of simultaneously predicting a set of dependent variables from their joint covariance
with a set of independent variables.
Q.5: What are the various methods of tabulation? Explain the
significance of processing the data. Discuss the role of the computer
in data processing and analysis, and explain the need for statistical
techniques in research.
Tabulation
The process of placing classified data into tabular form is known as tabulation. A table is a
systematic arrangement of statistical data in rows and columns. Rows are horizontal
arrangements whereas columns are vertical arrangements. A table may be simple, double or complex,
depending upon the type of classification.
Basic description
A table consists of an ordered arrangement of rows and columns. This is a simplified description
of the most basic kind of table. Certain considerations follow from this simplified description:
the term row has several common synonyms (e.g., record, k-tuple, n-tuple, vector);
the term column has several common synonyms (e.g., field, parameter, property,
attribute);
The elements of a table may be grouped, segmented, or arranged in many different ways, and
even nested recursively. Additionally, a table may include metadata, annotations, a header,[6] a
footer or other ancillary features.
Simple table
The following illustrates a simple table with three columns and six rows. The first row is not
counted, because it is only used to display the column names. This is traditionally called a
"header row".
An example of a table containing rows with summary information. The summary information
consists of subtotals that are combined from previous rows within the same column.
The concept of dimension is also a part of basic terminology.[7] Any "simple" table can be
represented as a "multi-dimensional" table by normalizing the data values into ordered
hierarchies. A common example of such a table is a multiplication table.
Multiplication table

×   1   2   3
1   1   2   3
2   2   4   6
3   3   6   9
NOTE: Multidimensional tables (2-dimensional in the example) are created under the condition
that the coordinates, i.e. each combination of the basic headers (margins), give a unique
attached value. Each combination of a value from the header row (row 0, for lack of a better
term) and a value from the header column (column 0, for lack of a better term) is related to a
unique value represented in the table:

column 1 and row 1 will only correspond to the value 1 (and no other)
column 1 and row 2 will only correspond to the value 2 (and no other), etc.

If this condition does not hold, it is necessary to insert extra columns or rows, which
increases the size of the table with plenty of empty cells.
To illustrate how a simple table can be transformed into a multi-dimensional table, consider the
following transformation of the Age table.

Modified Age Table (names only)

+        1                2                  3
Nancy    Nancy Davolio    Nancy Klondike     Nancy Obesanjo
Justin   Justin Saunders  Justin Timberland  Justin Daviolio
This is structurally identical to the multiplication table, except that it uses concatenation
instead of multiplication as the operator, and first and last names instead of integers as the
operands.
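Both tables can be sketched as mappings from header pairs to a unique cell value; the specific pairing of first and last names below is illustrative:

```python
# The multiplication table as a mapping: each (row header, column header)
# pair corresponds to exactly one cell value.
mult = {(r, c): r * c for r in (1, 2, 3) for c in (1, 2, 3)}
print(mult[(2, 3)])  # 6

# The same structure with concatenation as the operator and names as the
# operands (an illustrative pairing, as in the Modified Age Table).
first_names = ["Nancy", "Justin"]
last_names = ["Davolio", "Saunders", "Klondike"]
cells = {(f, l): f + " " + l for f in first_names for l in last_names}
print(cells[("Justin", "Saunders")])  # Justin Saunders
```

The dictionary keys make the uniqueness condition explicit: a given combination of headers can never map to two different values.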
Wide and Narrow Tables
Tables can be described as wide or narrow in format. A wide format has a separate column for each
data variable, whereas a narrow format has one column for all the variable values and another
column for the context of that value. See Wide and Narrow Data.
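The wide-to-narrow transformation can be sketched in a few lines; the field names and data below are illustrative assumptions:

```python
# A wide table: one column per variable.
wide = [
    {"name": "Nancy", "age": 33, "city": "Seattle"},
    {"name": "Justin", "age": 29, "city": "London"},
]

# The narrow equivalent: one row per (variable, value) pair, with the
# variable name providing the context for each value.
narrow = [
    {"name": row["name"], "variable": var, "value": row[var]}
    for row in wide
    for var in ("age", "city")
]

for record in narrow:
    print(record)
```

Each wide row becomes as many narrow rows as there are variables, which is why narrow tables are longer but structurally simpler.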
Importance Of Tabulation
There are no hard and fast rules for preparing a statistical table. Prof. Bowley has rightly
pointed out: "In collection and tabulation, common sense is the chief requisite and experience is
the chief teacher." However, the following points should be borne in mind while preparing a table.
(i) A good table must contain all the essential parts, such as a table number, title, head note,
caption, stub, body, foot note and source note.
(ii) A good table should be simple to understand. It should also be compact, complete and
self-explanatory.
(iii) A good table should be of proper size. There should be proper space for rows and columns.
One table should not be overloaded with details. Sometimes it is difficult to present the entire
data in a single table; in that case, the data should be divided among two or more tables.
(iv) A good table must have an attractive get-up. It should be prepared in such a manner that a
scholar can understand the problem without any strain.
(v) Rows and columns of a table must be numbered.
(vi) In all tables the captions and stubs should be arranged in some systematic manner. The
manner of presentation may be alphabetically, or chronologically depending upon the
requirement.
(vii) The unit of measurement should be mentioned in the head note.
(viii) The figures should be rounded off to the nearest hundred, thousand or lakh. This helps in
avoiding unnecessary detail.
(ix) Percentages and ratios should be computed. The percentage of each item's value to the total
should be given in parentheses just below the value.
(x) In case of non-availability of information, one should write N.A. or indicate it by dash (-).
(xi) Ditto marks should be avoided in a table. Similarly, the expression "etc." should not be
used in a table.
Image processing may be a minor job but it can greatly affect the marketing of your company.
Making high quality images and putting them in catalogs and brochures will surely get the
attention of your target clients and customers.
There are many benefits that you can get from data processing. First, the important data in your
company will be converted into a standard format that is understandable to you and your
employees. Since all the sets of information are in a standard electronic format, you can make a
back-up copy to use in case of data loss. These sets of information are ensured to be accurate,
so that you can make your decisions correctly. Lastly, you will save time, effort and money
because of data processing. You can also say goodbye to lost opportunities.
Statistical hypothesis testing plays an important role in the whole of statistics and in statistical
inference. For example, Lehmann (1992) in a review of the fundamental paper by Neyman and
Pearson (1933) says: "Nevertheless, despite their shortcomings, the new paradigm formulated in
the 1933 paper, and the many developments carried out within its framework continue to play a
central role in both the theory and practice of statistics and can be expected to do so in the
foreseeable future".
Significance testing has been the favored statistical tool in some experimental social sciences
(over 90% of articles in the Journal of Applied Psychology during the early 1990s).[11] Other
fields have favored the estimation of parameters (e.g., effect size). Significance testing is used as
a substitute for the traditional comparison of predicted value and experimental result at the core
of the scientific method. When theory is only capable of predicting the sign of a relationship, a
directional (one-sided) hypothesis test can be configured so that only a statistically significant
result supports theory. This form of theory appraisal is the most heavily criticized application of
hypothesis testing.
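A directional test of the kind described can be sketched as a one-sided z-test. A known population standard deviation is assumed, and all the numbers are illustrative:

```python
from math import erf, sqrt

# Illustrative one-sided z-test: does the sample mean exceed the
# hypothesised population mean mu0?
mu0 = 100.0          # hypothesised population mean (H0)
sigma = 15.0         # population standard deviation, assumed known
n = 36               # sample size
sample_mean = 106.0  # observed sample mean

z = (sample_mean - mu0) / (sigma / sqrt(n))      # test statistic
p_one_sided = 1 - 0.5 * (1 + erf(z / sqrt(2)))   # P(Z >= z) under H0

print(round(z, 2))        # 2.4
print(p_one_sided < 0.05)  # True: significant at the 5% level
```

Because only one tail is examined, a significant result supports the theory only when the deviation is in the predicted direction, which is precisely the feature critics of this form of theory appraisal object to.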