Data Edit Code

Learning Objectives
1. Explain the concepts of editing and coding
2. List the important considerations in editing and
coding
3. List and explain the key issues in error-checking
and data transformation
4. Explain the contents and uses of a code book
5. Edit and code completed questionnaires
PROCESSING OF DATA
The data collected in research are processed and analyzed to
reach conclusions or to verify the hypothesis
made.
Processing is important because it makes further analysis of the
data easier and more efficient. Technically, processing of data
involves:
1. Editing of the data
2. Coding of data
3. Classification of data
4. Tabulation of data.

Overview of the Stages of Data Analysis
EDITING
The process of checking and adjusting
responses in the completed questionnaires
for omissions, legibility, and consistency and
readying them for coding and storage

Types of Editing
1. Field Editing
Preliminary editing by a field supervisor on the
same day as the interview to catch technical
omissions, check legibility of handwriting, and
clarify responses that are logically or
conceptually inconsistent.
2. In-house (Central) Editing
Editing performed by central office staff; often done more
rigorously than field editing. Techniques include:
a) Backtracking
b) Allocating missing values
c) Plug values

Purpose of Editing
1. For consistency between and among
responses
2. For completeness in responses to reduce
effects of item non-response
3. To better utilize questions answered out of
order
4. To facilitate the coding process
Editing for Completeness
Item Nonresponse
The technical term for an unanswered question on an
otherwise complete questionnaire resulting in missing
data.
Plug Value
An answer that an editor plugs in to replace blanks or
missing values so as to permit data analysis; choice of
value is based on a predetermined decision rule.
Impute
To fill in a missing data point through the use of a
statistical algorithm that provides a best guess for the
missing response based on available information.
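The difference between a plug value and an imputed value can be sketched in a few lines of Python. This is only an illustration: the responses, the choice of the scale midpoint as the decision rule, and the use of the observed mean as the "statistical algorithm" are all assumptions, not part of the original material.

```python
from statistics import mean

# Hypothetical coded answers to one 5-point item; None marks item nonresponse.
responses = [4, 5, None, 3, None, 4]

# Assumed decision rule: plug the neutral midpoint of the 1-5 scale.
PLUG_VALUE = 3


def plug(values, plug_value):
    """Replace each missing answer with a fixed, predetermined plug value."""
    return [plug_value if v is None else v for v in values]


def impute_mean(values):
    """Replace each missing answer with the mean of the observed answers.

    Mean imputation is one simple stand-in for a statistical algorithm;
    real studies may use more elaborate methods.
    """
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


plugged = plug(responses, PLUG_VALUE)
imputed = impute_mean(responses)
```

The plug value comes from a rule fixed in advance, while the imputed value depends on the other answers actually observed.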
Facilitating the Coding
Process
Data Clean-up
Checking written responses for any stray marks
Editing and Tabulating "Don't Know" Answers
Legitimate don't know (no opinion)
Reluctant don't know (refusal to answer)
Confused don't know (does not understand)

Editing (cont'd)
Pitfalls of Editing
Allowing subjectivity to enter into the editing process.
Data editors should be intelligent, experienced, and
objective.
Failing to have a systematic procedure for assessing the
questionnaires developed by the research analyst
An editor should have clearly defined decision rules to
follow.
Pretesting Edit
Editing during the pretest stage can prove very valuable for
improving questionnaire format, identifying poor instructions
or inappropriate question wording.
CODING
The process of identifying and classifying each
answer given by a respondent with a numerical score or other
character symbol
The numerical score or symbol is called a code,
and serves as a rule for interpreting, classifying,
and recording data
Identifying responses with codes is necessary if
data is to be processed by computer
Coding - Continued
Coded data is often stored electronically in the form of a
data matrix - a rectangular arrangement of the data into
rows (representing cases) and columns (representing
variables)

The data matrix is organized into fields, records, and files:
Field: A collection of characters that represents a single
type of data
Record: A collection of related fields, i.e., fields related to
the same case (or respondent)
File: A collection of related records, i.e. records related to
the same sample
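As a rough illustration of fields, records, and files, the sketch below builds a tiny data matrix in Python. The variable names and codes are hypothetical and chosen only to show the row-by-column layout.

```python
# Field names (columns = variables); all names and codes are hypothetical.
fields = ["respondent_id", "gender", "satisfaction"]

# Each record is one case (respondent): a collection of related fields.
records = [
    {"respondent_id": 1, "gender": 1, "satisfaction": 4},
    {"respondent_id": 2, "gender": 2, "satisfaction": 5},
    {"respondent_id": 3, "gender": 1, "satisfaction": 2},
]

# The file is the collection of related records for the same sample.
data_file = records

# Rectangular view: one row per case, one column per variable.
data_matrix = [[record[f] for f in fields] for record in data_file]


def column(matrix, field_names, name):
    """Return all values of one variable (one column of the matrix)."""
    idx = field_names.index(name)
    return [row[idx] for row in matrix]


satisfaction_scores = column(data_matrix, fields, "satisfaction")
```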

Coding
Codebook formulation: the formal
standardization of all the variables under study.
When designing the codebook, we must take care of:
Appropriateness to the research objective
Comprehensive options
Mutually exclusive options
Single variable entry

Coding
Coding of closed ended structured questions
Dichotomous questions
Ranking questions
Checklist /multiple responses
Scaled questions
Missing values
Key Issues in Coding
1. Pre-coding and post-coding
2. Pre-Coding Fixed-Alternative Questions (FAQs)
- Writing codes for FAQs on the questionnaire
before the data collection
3. Coding Open-Ended Questions - A 3-stage
process:
(a) Perform a test tabulation, (b) Devise a coding
scheme, (c) Code all responses
Two Rules For Code Construction are:
a) Coding categories should be exhaustive
b) Coding categories should be mutually exclusive and
independent
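The two rules for code construction can be checked mechanically when categories are numeric ranges. The following is a minimal sketch, assuming a hypothetical age-coding scheme with inclusive limits; the scheme and the helper names are illustrative only.

```python
# Hypothetical coding scheme for age in years: code -> (low, high), inclusive.
age_codes = {1: (18, 29), 2: (30, 44), 3: (45, 64), 4: (65, 120)}


def matching_codes(value, scheme):
    """Return every code whose range contains the value."""
    return [code for code, (low, high) in scheme.items() if low <= value <= high]


def check_scheme(scheme, possible_values):
    """Return (uncovered, overlapping) values for a coding scheme.

    uncovered   -> values with no code (scheme is not exhaustive)
    overlapping -> values with several codes (not mutually exclusive)
    """
    uncovered = [v for v in possible_values if len(matching_codes(v, scheme)) == 0]
    overlapping = [v for v in possible_values if len(matching_codes(v, scheme)) > 1]
    return uncovered, overlapping


uncovered, overlapping = check_scheme(age_codes, range(18, 121))
```

Both lists are empty for this scheme, so every possible age receives exactly one code.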
Issues in Coding - Continued
4. Maintaining a Code Book - A book that
identifies each variable in a study, the variable's
description, code name, and position in the data
matrix
5. Production Coding - The physical activity of
transferring the data from the questionnaire or data
collection form [to the computer] after the data have
been collected. Sometimes done through a coding
sheet: ruled paper drawn to mimic the data matrix
6. Combining Editing and Coding

AFTER CODING
1. Data Entry - The transfer of codes from
questionnaires (or coding sheets) to a computer.
Often accomplished in one of three ways:
a) On-line direct data entry
b) Optical scanning for highly structured
questionnaires
c) Keyboarding data entry via a computer
keyboard; often requires verification

After Coding - Continued
2. Error Checking - Verifying the accuracy of
data entry and checking for some kinds of
obvious errors made during the data entry.
Often accomplished through frequency
analysis.
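A frequency analysis for error checking can be as simple as counting how often each code was entered and flagging codes that are not in the code book. The sketch below assumes a hypothetical yes/no item with a missing-value code of 9; the entered values are made up.

```python
from collections import Counter

# Hypothetical code book entry: 1 = yes, 2 = no, 9 = missing.
valid_codes = {1, 2, 9}

# Hypothetical keyed-in values for that variable; 7 and 22 are typing errors.
entered = [1, 2, 2, 1, 7, 9, 1, 22, 2]

# Frequency analysis: how often each code appears in the entered data.
frequencies = Counter(entered)

# Any code outside the code book points to an obvious data entry error.
out_of_range = {code: n for code, n in frequencies.items() if code not in valid_codes}
```

This catches only out-of-range codes; a wrong but valid code (a 1 keyed as 2) still requires verification against the questionnaire.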

After Coding - Continued
3. Data Transformation - Converting some of the
data from the format in which they were entered to
a format most suitable for a particular statistical
analysis.
Often accomplished through re-coding, to:
a) reverse-score negative (or positive) statements
into positive (or negative) statements;
b) collapse the number of categories of a variable
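Both kinds of re-coding can be sketched in Python. The example assumes a 5-point agreement scale and a hypothetical rule for collapsing it into three categories; neither is prescribed by the material above.

```python
# Assumed 5-point scale: 1 = strongly disagree ... 5 = strongly agree.
SCALE_MIN, SCALE_MAX = 1, 5


def reverse_score(value):
    """Reverse-score an item so that 1 becomes 5, 2 becomes 4, and so on."""
    return SCALE_MIN + SCALE_MAX - value


# Hypothetical collapsing rule: 1-2 -> 1 (disagree), 3 -> 2 (neutral),
# 4-5 -> 3 (agree).
collapse_map = {1: 1, 2: 1, 3: 2, 4: 3, 5: 3}


def collapse(value):
    """Collapse the five original categories into three."""
    return collapse_map[value]


answers = [1, 2, 3, 4, 5]
reversed_answers = [reverse_score(v) for v in answers]
collapsed_answers = [collapse(v) for v in answers]
```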

EDITING REQUIRES SOME CAREFUL
CONSIDERATIONS:
The editor must be familiar with the interviewer's mindset, the objectives,
and everything else related to the study.
Editors should use a different color when making entries in the
collected data.
They should initial all answers or changes they make to the data.
The editor's name and the date of editing should be placed on the
data sheet.
CODING:
Classification of responses may be done on the basis of
one or more common concepts.
In coding a particular numeral or symbol is assigned to the
answers in order to put the responses in some definite
categories or classes.
The classes of responses determined by the researcher
should be appropriate and suitable to the study.
Coding enables efficient and effective analysis as the
responses are categorized into meaningful classes.
Coding decisions are considered while developing or
designing the questionnaire or any other data collection tool.
Coding can be done manually or by computer.
CLASSIFICATION:
Classification of the data means that the collected raw
data are categorized into groups that share a common
feature.
Data having common characteristics are placed in a
common group.
The entire data collected is categorized into various groups
or classes, which convey a meaning to the researcher.
Classification is done in two ways:
1. Classification according to attributes.
2. Classification according to the class intervals.
CLASSIFICATION ACCORDING TO THE ATTRIBUTES:
Here the data are classified on the basis of common
characteristics that can be descriptive, like literacy, sex,
honesty, marital status, etc., or numerical, like weight, height,
income, etc.
Descriptive features are qualitative in nature and cannot be
measured quantitatively, but they are still considered when
making an analysis.
Analysis used for such classified data is known as statistics
of attributes and the classification is known as the
classification according to the attributes.

CLASSIFICATION ON THE BASIS OF THE INTERVAL:
Numerical features of data can be measured quantitatively
and analyzed with the help of some statistical unit; data
relating to income, production, age, weight, etc.
come under this category. Such data are known as
statistics of variables and are classified by way of
intervals.
CLASSIFICATION ACCORDING TO THE CLASS
INTERVAL USUALLY INVOLVES THE FOLLOWING
THREE MAIN PROBLEMS:
1. Number of Classes.
2. How to select class limits.
3. How to determine the frequency of each class.
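The three problems can be made concrete with a small frequency table. The sketch below assumes equal-width classes that include the lower limit and exclude the upper limit; the income figures, the lower limit, the class width, and the number of classes are all hypothetical choices.

```python
def frequency_table(values, lower, width, n_classes):
    """Count values per class [lower + i*width, lower + (i+1)*width).

    Each class includes its lower limit and excludes its upper limit,
    so no value can fall into two classes.
    """
    counts = [0] * n_classes
    for v in values:
        idx = int((v - lower) // width)
        if not 0 <= idx < n_classes:
            raise ValueError(f"{v} falls outside the chosen class limits")
        counts[idx] += 1
    limits = [(lower + i * width, lower + (i + 1) * width) for i in range(n_classes)]
    return list(zip(limits, counts))


# Hypothetical monthly incomes (in thousands).
incomes = [12, 25, 31, 47, 18, 39, 22, 45, 30, 10]

# Assumed choices: 4 classes of width 10, starting at 10.
table = frequency_table(incomes, lower=10, width=10, n_classes=4)
```

Changing the number of classes or the class limits changes the frequencies, which is why these choices are treated as the main problems of this kind of classification.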
TABULATION:
The mass of data collected has to be arranged in some kind of
concise and logical order.
Tabulation summarizes the raw data and displays data in form
of some statistical tables.
Tabulation is an orderly arrangement of data in rows and
columns.
OBJECTIVES OF TABULATION:
1. Conserves space & minimizes explanation and descriptive
statements.
2. Facilitates the process of comparison and summarization.
3. Facilitates detection of errors and omissions.
4. Establishes the basis of various statistical computations.
BASIC PRINCIPLES OF TABULATION:
1. Tables should be clear, concise & adequately titled.
2. Every table should be distinctly numbered for easy
reference.
3. Column headings & row headings of the table should be
clear & brief.
4. Units of measurement should be specified at appropriate
places.
5. Explanatory footnotes concerning the table should be
placed at appropriate places.
6. The source of the data should be clearly indicated.
7. The columns & rows should be clearly separated with
dark lines.
8. Demarcation should also be made between data of one
class and that of another.
9. Comparable data should be put side by side.
10. Percentage figures should be rounded before
tabulation.
11. Figures, symbols, etc. should be
properly aligned and adequately spaced to enhance
readability.
12. Abbreviations should be avoided.

Post tabulation
Exploratory analysis
Statistical software packages
MS Excel
Minitab
Statistical Analysis System (SAS)
Statistical Package for the Social Sciences (SPSS)
ANALYSIS OF DATA

The important statistical measures that are used to analyze
the research or the survey are:
1. Measures of central tendency (mean, median & mode)
2. Measures of dispersion (standard deviation, range, mean
deviation)
3. Measures of asymmetry (skewness)
4. Measures of relationship (correlation and regression)
5. Association in case of attributes.
6. Time series analysis
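The first two groups of measures can be computed with Python's standard `statistics` module. This is a small sketch on made-up scores; it uses the population standard deviation, which is an assumption about how the data should be treated.

```python
import statistics

# Hypothetical scores from eight respondents.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Measures of central tendency.
mean_score = statistics.mean(scores)
median_score = statistics.median(scores)
mode_score = statistics.mode(scores)

# Measures of dispersion.
value_range = max(scores) - min(scores)
std_dev = statistics.pstdev(scores)  # population standard deviation
mean_deviation = sum(abs(x - mean_score) for x in scores) / len(scores)
```

For these scores the mean is 5, the median 4.5, the mode 4, the range 7, the standard deviation 2, and the mean deviation 1.5.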

TESTING THE HYPOTHESIS
Several factors determine the
appropriate statistical technique to use when conducting a
hypothesis test. The most important are:
1. The type of data being measured.
2. The purpose or the objective of the statistical inference.
Hypothesis can be tested by various techniques. The
hypothesis testing techniques are divided into two broad
categories:
1. Parametric Tests.
2. Non-Parametric Tests.
PARAMETRIC TESTS:
These tests depend upon assumptions, typically that the
population(s) from which data are randomly sampled
have a normal distribution. Types of parametric tests are:
1. t-test
2. z-test
3. F-test
4. χ²-test
NON-PARAMETRIC TESTS:
The various types of non-parametric tests are:
1. Wilcoxon Signed-Rank Test (for comparing two
populations)
2. Kolmogorov-Smirnov Test (to test whether or not the
sample of data is consistent with a specified distribution
function)
3. Runs Test (in studies where measurements are made
according to some well-defined ordering, either in time or
space, a frequent question is whether or not the average
value of the measurement differs at different points in the
sequence; this test provides a means of testing this)
4. Sign Test (a single-sample test that can be used
instead of the single-sample t-test or paired t-test)
5. Chi-square test
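To show how a non-parametric test avoids the normality assumption, here is a minimal sketch of a two-sided sign test on paired observations. It counts only the signs of the differences and uses the binomial distribution with p = 0.5. The before/after ratings are hypothetical, and dropping tied pairs is one common convention assumed here.

```python
from math import comb


def sign_test(before, after):
    """Two-sided sign test on paired observations; tied pairs are dropped.

    Returns (number of increases, number of decreases, p-value).
    """
    diffs = [a - b for b, a in zip(before, after) if a != b]
    n = len(diffs)
    positives = sum(1 for d in diffs if d > 0)
    negatives = n - positives
    k = min(positives, negatives)
    # P(X <= k) for X ~ Binomial(n, 0.5), doubled for a two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return positives, negatives, min(1.0, 2 * tail)


# Hypothetical ratings from nine respondents before and after a treatment.
before = [5, 6, 4, 7, 5, 3, 6, 5, 4]
after = [7, 7, 7, 6, 7, 7, 6, 6, 6]

increases, decreases, p_value = sign_test(before, after)
```

Only the direction of each change enters the calculation, not its size, which is why no assumption about the shape of the population distribution is needed.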

INTERPRETATION:
Interpretation establishes the relationships within the collected data
on the basis of the analysis. It looks beyond the data of the
study and draws on other research, theory, and hypotheses.
Interpretation serves as a tool to explain the
observations of the researcher during the research period,
and it acts as a guide for future research.
WHY Interpretation?
- The researcher understands the abstract principle underlying
the findings.
- Interpretation links up the findings with those of other similar
studies.
- The researcher is able to make others understand the real
importance of his research findings.

PRECAUTIONS IN INTERPRETATION:
1. The researcher must ensure that the data are appropriate,
trustworthy, and adequate for drawing inferences.
2. The researcher must be cautious about errors and take the
necessary action if an error arises.
3. The researcher must ensure the correctness of the data
analysis process, whether the data are qualitative or
quantitative.
4. The researcher must try to bring hidden facts and
non-obvious factors to the front and combine them with
the factual interpretation.
5. The researcher must also ensure that there is
constant interaction between the initial hypothesis, empirical
observations, and theoretical concepts.
