
Chapter Seven

Sampling and Non-Sampling Errors


Two major types of error can arise when a sample of observations is taken from a population: Sampling error and
non-sampling error.
Sampling error refers to differences between the sample and the population that exist only because of the
observations that happened to be selected for the sample.
Another way to look at this: the difference in results between different samples of the same size is due to sampling error.
E.g. take two samples of size 10 from 1,000 households. If we happened to get the highest-income households in the first sample and the lowest-income households in the second, the difference between the two results is due to sampling error.
Whenever a sample is drawn, only that part of the population is measured, and it is used to represent the entire population. Hence there will always be some error in the data, resulting from those members of the population who were not measured. Increasing the sample size reduces this type of error, so that if a census is performed (a census is a 100 percent sample), there is by definition no sampling error.
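The effect of sample size on sampling error can be illustrated with a small simulation. This is only a sketch: the population of 1,000 household incomes is made up, and the function name `spread_of_sample_means` is a hypothetical helper, not something from the chapter. It repeatedly draws samples of a given size and reports how much the sample mean varies from sample to sample.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: incomes of 1,000 households (made-up values).
population = [random.gauss(50_000, 15_000) for _ in range(1_000)]

def spread_of_sample_means(n, repeats=200):
    """Standard deviation of the sample mean over repeated samples of size n."""
    means = [statistics.mean(random.sample(population, n)) for _ in range(repeats)]
    return statistics.stdev(means)

# The spread shrinks as n grows; with n = 1,000 every "sample" is a census.
for n in (10, 100, 1_000):
    print(n, round(spread_of_sample_means(n), 2))
```

With a sample of 1,000 out of 1,000 the spread is zero, which matches the statement that a census has no sampling error.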
Non-sampling errors are more serious and are due to mistakes made in the acquisition of data or due to the sample
observations being selected improperly. There are three types of non-sampling errors:
Errors in data acquisition,
Non-response errors, and
Selection bias.
Note: increasing the sample size will not reduce this type of error.
Errors in data acquisition
These arise from the recording of incorrect responses, due to:
Incorrect measurements being taken because of faulty equipment,
Mistakes made during transcription from primary sources,
Inaccurate recording of data due to misinterpretation of terms, or
Inaccurate responses to questions concerning sensitive issues.
Non-response Error
It refers to error (or bias) introduced when responses are not obtained from some members of the sample, so that the sample observations that are collected may not be representative of the target population.
As mentioned earlier, the response rate (i.e. the proportion of all people selected who complete the survey) is a key survey parameter; it helps in assessing the validity of the survey and the sources of non-response error.
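As a small illustration, the response rate defined above can be computed as follows. The function name and the figures (130 completed interviews out of 200 people selected) are made up for the example.

```python
def response_rate(completed, selected):
    """Completed interviews as a percentage of everyone selected for the sample."""
    if selected <= 0:
        raise ValueError("selected must be positive")
    return 100 * completed / selected

# Hypothetical survey: 200 people selected, 130 completed the interview.
print(response_rate(130, 200))  # 65.0
```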
Selection Bias
It occurs when the sampling plan is such that some members of the target population cannot possibly be selected for
inclusion in the sample.
Basic sampling problems
Three problems must be addressed in any sampling operation.
1. Definition of the universe being studied
The first problem is the definition of the universe being studied. The universe is the entire group of items the researchers wish to study and about which they plan to generalize. Failure to define the universe appropriately, in accord with the study objectives, may yield misleading results. The universe must therefore be defined with care.
2. Definition of the variables being studied
The second problem is the definition of the variables to be studied. For example, assume a firm wishes to determine whether Katmandu metropolitan area grocery stores stock a particular brand of pickles. In this case only one variable is being studied, and it may be given a strict definition: a store either has the brand in stock or it does not.
3. Sample design
Sample design is the third problem that must be addressed in any sampling operation. It consists of three elements: determining the sampling units (drawing up the list of sampling units), selecting the sample (choosing the method of sampling), and estimating universe characteristics from the sample data.
A sample is a part of a whole, taken to show what the rest is like. Sampling helps to estimate the corresponding values of the population and plays a vital role in marketing research.
Sampling (i.e. selecting a sub-set of a whole population) is often done for reasons of cost (it is less expensive to sample 1,000 television viewers than 100 million TV viewers) and practicality (e.g. performing a crash test on every automobile produced is impractical).
In any case, the sampled population and the target population should be similar to one another.
Samples offer many benefits
Save costs: Less expensive to study the sample than the population.
Save time: Less time needed to study the sample than the population.
Accuracy: Since sampling is done with care and studies are conducted by skilled and qualified interviewers, the results are expected to be accurate.
Destructive nature of elements: For some elements, sampling is the only way to test, since the test destroys the element itself.
Limitations of Sampling
Demands more rigid control in undertaking the sampling operation.
When sub-groups are minorities or small in number, the findings for them are often suspect.
Accuracy may be affected when data are subjected to weighting.

Sampling method
Probability Sampling
In probability sampling, every element in the target population or universe [sampling frame] has a known, non-zero probability of being chosen for the sample; in the simplest designs this probability is equal for all elements. It is therefore scientific, operationally convenient and simple in theory, and the results from this sampling method may be generalized. There are different methods of probability sampling, but we focus our attention on these:
Simple Random Sampling,
Systematic Sampling,
Stratified Random Sampling,
Cluster and Multistage Sampling, and
Area Sampling.
1. Simple Random sampling
A simple random sample is a sample selected in such a way that every possible sample of the same size is
equally likely to be chosen.
Drawing three names from a hat containing all the names of the students in the class is an example of a simple random sample: any group of three names is as likely to be picked as any other group of three names. We can use a random number table to select the sample under simple random sampling.

Simple random sampling gives each member of the target population a known and equal probability of selection. The two basic procedures are:
1. the lottery method, e.g. picking numbers out of a hat or bag
2. the use of a table of random numbers.
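The lottery method has a direct software equivalent. The sketch below uses Python's standard `random` module in place of a hat or a random number table; the class list is a placeholder, not data from the text.

```python
import random

# Hypothetical class list; the names are placeholders.
students = [f"Student {i}" for i in range(1, 31)]

rng = random.Random(7)            # seeded generator so the draw can be repeated
sample = rng.sample(students, 3)  # draws without replacement; every group of three is equally likely
print(sample)
```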
2. Systematic sampling
Systematic sampling is a modification of random sampling. To arrive at a systematic sample we simply calculate the desired sampling interval, e.g. if there are 100 distributors of a particular product in which we are interested and our budget allows us to sample, say, 20 of them, then we divide 100 by 20 and get a sampling interval of 5 (a sampling fraction of 1 in 5). We then pick a random starting point among the first 5 distributors and go through our sampling frame selecting every 5th distributor from there. In the purest sense this does not give rise to a true random sample, since some systematic arrangement is used in listing and, once the starting point is fixed, the remaining distributors have no chance of being selected. However, because there is no conscious control of precisely which distributors are selected, all but the most pedantic of practitioners would treat a systematic sample as though it were a true random sample.
Figure 7.2 Systematic sampling as applied to a survey of retailers

Systematic sampling
Population = 100 Food Stores
Sample desired = 20 Food Stores
a. Draw a random number 1-5.
b. Sample every Xth store (here X = 100 / 20 = 5).

Sample (numbered stores), one row per possible random start:
1, 6, 11, 16, 21 ... 96
2, 7, 12, 17, 22 ... 97
3, 8, 13, 18, 23 ... 98
4, 9, 14, 19, 24 ... 99
5, 10, 15, 20, 25 ... 100
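The procedure for the 100 food stores can be sketched in a few lines. The function name `systematic_sample` is a hypothetical helper, and the stores are simply numbered 1 to 100 as in the figure.

```python
import random

def systematic_sample(frame, sample_size, rng=random):
    """Pick a random start in the first interval, then take every k-th unit."""
    interval = len(frame) // sample_size   # k = 100 // 20 = 5
    start = rng.randint(1, interval)       # random start, 1..k inclusive
    return frame[start - 1::interval][:sample_size]

stores = list(range(1, 101))               # stores numbered 1..100
chosen = systematic_sample(stores, 20, random.Random(1))
print(chosen)
```

Whatever the random start, the result is one of the five possible samples of 20 stores spaced 5 apart.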

3. Stratified samples
Stratification increases precision without increasing sample size. Stratification does not imply any departure from the principles of randomness; it merely denotes that, before any selection takes place, the population is divided into a number of strata, and random samples are then taken within each stratum. It is only possible to do this if the distribution of the population with respect to a particular factor is known, and if it is also known to which stratum each member of the population belongs. Examples of characteristics which could be used in marketing to stratify a population include: income, age, sex, race, geographical region, possession of a particular commodity.
Stratification can also occur after selection of individuals. For example, if one wanted to stratify a sample of individuals in a town by age, one could easily get figures for the age distribution, but if there is no general population list showing the age of each individual, prior stratification would not be possible. What might have to be done in this case is to weight the results at the analysis stage to restore proportional representation. Weighting can easily destroy the assumptions one is able to make when interpreting data gathered from a random sample, and so stratification prior to selection is advisable. Stratified random sampling is usually more precise, and often more convenient, than simple random sampling.
When stratified sampling designs are to be employed, there are 3 key questions which have to be addressed immediately:

The bases of stratification, i.e. what characteristics should be used to subdivide the universe/population into strata?
The number of strata, i.e. how many strata should be constructed and what stratum boundaries should be used?
Sample sizes within strata, i.e. how many observations should be taken in each stratum?
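One common answer to the third question is proportional allocation, where each stratum's share of the sample equals its share of the population. The sketch below assumes that rule; the chapter does not prescribe a particular allocation, and the strata and household counts are invented for illustration.

```python
def proportional_allocation(stratum_sizes, total_sample):
    """Allocate a sample across strata in proportion to stratum size."""
    population = sum(stratum_sizes.values())
    return {
        name: round(total_sample * size / population)
        for name, size in stratum_sizes.items()
    }

# Hypothetical strata: household counts by income band (made-up figures).
strata = {"low": 5_000, "middle": 12_000, "high": 3_000}
allocation = proportional_allocation(strata, 200)
print(allocation)  # {'low': 50, 'middle': 120, 'high': 30}
```

With less convenient numbers the rounded allocations may not add up exactly to the total, so the result should be checked and adjusted by hand. A random sample of the allocated size is then drawn within each stratum.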

4. Cluster and multistage sampling


Cluster sampling: The process of sampling complete groups or units is called cluster sampling; situations where there is any sub-sampling within the clusters chosen at the first stage are covered by the term multistage sampling. For example, suppose that a survey is to be done in a large town and that the
unit of inquiry (i.e. the unit from which data are to be gathered) is the individual household. Suppose
further that the town contains 20,000 households, all of them listed on convenient records, and that a
sample of 200 households is to be selected. One approach would be to pick the 200 by some random
method. However, this would spread the sample over the whole town, with consequent high fieldwork
costs and much inconvenience. (All the more so if the survey were to be conducted in rural areas,
especially in developing countries where rural areas are sparsely populated and access difficult). One
might decide therefore to concentrate the sample in a few parts of the town and it may be assumed for
simplicity that the town is divided into 400 areas with 50 households in each. A simple course would be
to select say 4 areas at random (i.e. 1 in 100) and include all the households within these areas in our
sample. The overall probability of selection is unchanged, but by selecting clusters of households, one
has materially simplified and made cheaper the fieldwork.
A large number of small clusters is better, all other things being equal, than a small number of large clusters. Whether single-stage cluster sampling proves to be as statistically efficient as simple random sampling depends upon the degree of homogeneity within clusters. If respondents within clusters are homogeneous with respect to such things as income, socio-economic class etc., they do not fully represent the population and will, therefore, give larger standard errors. On the other hand, the lower cost of cluster sampling often outweighs the disadvantage of statistical inefficiency. In short, cluster sampling tends to offer greater reliability for a given cost rather than greater reliability for a given sample size.
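The town example (400 areas of 50 households, 4 areas selected) can be written out as a short sketch. The household labels are placeholders generated for the illustration.

```python
import random

rng = random.Random(3)

# The town from the example: 400 areas, 50 households in each.
areas = {a: [f"area{a}-hh{h}" for h in range(1, 51)] for a in range(1, 401)}

chosen_areas = rng.sample(sorted(areas), 4)              # 4 of 400 areas, i.e. 1 in 100
sample = [hh for a in chosen_areas for hh in areas[a]]   # take every household in those areas

print(chosen_areas, len(sample))
```

Each household still has a 1 in 100 chance of selection, but the 200 sampled households sit in only four areas.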
Multistage sampling: The population is regarded as being composed of a number of first-stage or primary sampling units (PSUs), each of them being made up of a number of second-stage units. A sample of PSUs is selected first, then a sample of second-stage units is drawn within each selected PSU, and so the procedure continues down to the final sampling unit, with the sampling ideally being random at each stage.
The necessity of multistage sampling is easily established. PSU's for national surveys are often
administrative districts, urban districts or parliamentary constituencies. Within the selected PSU one may
go direct to the final sampling units, such as individuals, households or addresses, in which case we have
a two-stage sample. It would be more usual to introduce intermediate sampling stages, i.e. administrative
districts are sub-divided into wards, then polling districts.
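A two-stage design can be sketched by extending the same kind of town frame. The numbers here (8 PSUs, 25 households per PSU) are assumptions chosen so that the overall sample is again 200 households; they do not come from the text.

```python
import random

rng = random.Random(11)

# Hypothetical frame: 400 primary sampling units (areas) of 50 households each.
psus = {p: [f"psu{p}-hh{h}" for h in range(1, 51)] for p in range(1, 401)}

stage1 = rng.sample(sorted(psus), 8)                    # stage 1: 8 PSUs at random
stage2 = {p: rng.sample(psus[p], 25) for p in stage1}   # stage 2: 25 households per selected PSU

sample = [hh for households in stage2.values() for hh in households]
print(len(sample))  # 200
```

Under these assumptions each household has a selection probability of 8/400 x 25/50 = 1/100, the same as in a simple random sample of 200 from 20,000.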
Area sampling
Area sampling is basically multistage sampling in which maps, rather than lists or registers, serve as the sampling frame. This is the main method of sampling in developing countries where adequate population lists are rare. The area to be covered is divided into a number of smaller sub-areas, from which a sample is selected at random; within these areas, either a complete enumeration is taken or a further sub-sample is drawn.
Non-Probability Sampling
1. Convenience sampling: the researcher selects the sample units according to his or her convenience, i.e. the sample is chosen purely for expedience (e.g. items are selected because they are easy or cheap to find and measure). An extreme example is monitoring price trends in a nearby grocery store with the objective of inferring national price movements. Convenience sampling is sometimes useful in marketing research for some specialized purposes. If one has very little information about a subject, then a small-scale convenience sample can be of value in exploratory work, to help understand the range of variability of responses in the subject area. Just talking to a few consumers may help identify issues.
2. Quota sampling: this method is used when particular properties of the population have to be studied; the population is divided into classes and a fixed quota of sample units is assigned to each class. As in stratified random sampling, the researcher begins by constructing strata. Bases for stratification in consumer surveys are commonly demographic, e.g. age, sex, income and so on. Often compound stratification is used, for example age groups within sex. Next, sample sizes are established for each stratum. The sampling within strata may be proportional or disproportional. Fieldworkers are then instructed to conduct interviews with the designated quotas, with the identification of individual respondents being left to the fieldworkers. Owing to its relative economy and speed of execution, quota sampling will continue to enjoy wide usage. Because it uses the principle of stratification, this method is likely to be superior to ordinary convenience sampling or judgmental sampling.
3. Judgmental sampling: this method is used when the possible error is not so serious and probability sampling does not seem to be possible. Under this method, universe items are selected by means of expert judgment: specialists in the subject matter of the survey choose what they believe to be the best sample for that particular study. This approach has been found empirically to produce unsatisfactory results, and there is no objective way of evaluating the precision of sample results. Despite these limitations, this method may be useful when the total sample size is extremely small.

Fieldwork procedures
The gathering of data by individuals such as personal or telephone interviewers and observers is, broadly speaking, referred to as fieldwork. The selection of fieldworkers with adequate training and experience is essential for the success of the research project: a research study is only as good as the data input. There are organizations to whom the task of gathering data and information can be outsourced, which are referred to as field interviewing services. Such organizations gather data (e.g. through personal and telephone interviews or observation) for a fee, and they may also select, train and supervise fieldworkers. It is important for fieldworkers to have certain characteristics when gathering data, e.g. being well-dressed and well-groomed, pleasant in disposition, outgoing, keen to interact with strangers, etc.

Fieldwork involves the selection, training, supervision and evaluation of individuals who collect data in the field. Data collection may be by interview or observation. Since the problems are greater in the interview process, the following discussion is primarily in terms of interviewing by telephone and in person.
Telephone interviewing
When a telephone interview is computer-assisted, the paper questionnaire is replaced with a video screen. The questionnaire is entered into the computer in such a way that the questions come up on the screen in the proper sequence. Interviewers read each question and either type in the answers or use light pens to mark the answers on the video screen. This procedure has the advantage of controlling the questionnaire and having the data entered in the computer directly, so that the results can be summarized quickly at any time. Interviewers need training at the telephone interviewing site; training in this case may involve instruction on using the computer. A supervisor can observe the work of individual interviewers on a master screen that shows what is happening on a given interviewer's screen, while what is being said comes over an audio monitor.
Personal interviewing
When data are collected in person at more dispersed locations, fieldwork procedures become more difficult.
The first step is the selection of fieldworkers. Since data are to be collected from many different geographic locations, one or more fieldworkers with the desired characteristics must be found in each of these places. Most organizations keep a file of such workers by geographic location. Fieldworkers may be recruited with the help of local newspapers and educational institutions.
In the second step, training is given to the fieldworkers. A briefing session provides all interviewers with identical background information about the project. A training interview is a practice session in which an inexperienced fieldworker records answers on a questionnaire to develop his or her skills and to clarify the requirements of the research project in question. The purpose of the briefing session and training interview is to ensure that data are gathered by the fieldworkers in a uniform manner. The major themes of fieldworker training are as follows:
1) Making initial contact with the respondent and securing the interview
2) Asking survey questions
3) Probing
4) Recording Responses
5) Terminating the interview
Making initial contact
Polite introduction
Try to convince the person being interviewed that cooperation is important
Use of ID cards
Avoid using certain words which give the respondent a means of quickly ending the interview
(e.g.: may I ...)
Do not be too aggressive
Asking survey questions
Questions must be asked in a manner that avoids interviewer bias
Asking questions must take 5 major principles into consideration, i.e.:
ask the questions exactly as they are worded in the questionnaire,
read each question very slowly,
ask the questions in the order in which they are presented in the questionnaire,
ask every question specified in the questionnaire and
repeat questions that are easily misunderstood or misinterpreted
Probing

Probing means verbal prompts given by the fieldworker or interviewer when the respondent must be motivated to communicate his or her answer, or to enlarge on, clarify, or explain an answer.
Interviewers must try to ensure that answers provided by the respondents are complete and unambiguous, and that the respondent does not lose track of the questions being asked.
Interviewers must ensure that they do not influence the respondents when probing (neutrality).
Recording responses
All fieldworkers should use the same procedure for recording responses.
In the case of open-ended questions, responses should be recorded in the respondent's own words (no summarizing or paraphrasing) and should include an indication as to whether probing was required.
Terminating the interview
The interview should not be terminated before all the information has been gathered from the respondent.
Hasty departure by the interviewer should be avoided, because respondents sometimes offer valuable information after the formal interview. Moreover, a hasty departure is considered impolite, and the respondent may have to be reinterviewed.
Interviewers should have the courtesy to answer questions which the respondent may have after the interview.
The third step of the fieldwork procedure is supervising the fieldworkers. Supervision ensures that the fieldwork is proceeding on schedule and that the work is satisfactory as per the guidelines. On the basis of supervision, directors need to keep tight control of day-to-day field operations and must be ready to replace individual interviewers quickly, if necessary.
In the fourth step, a verification check is made to be certain that the interviews were actually conducted and that the interviewers did not cheat. The questionnaires or data forms turned in are checked for completeness, compliance with instructions and the apparent ability of the worker to obtain useful data.
Common sources of errors in fieldwork
Non-observation errors: failure to obtain data from parts of the survey population (sometimes referred to as sample selection error)
Are there segments of your population that you are failing to include in the sample? This is a non-coverage error. Are some units over-represented (listed more than once)? This is an over-coverage error. Both are flaws in the sample frame.
Are there members of the original sample, with particular characteristics, who were not among the survey respondents? This is a non-response error.
Data Processing Error
Errors from entering, coding and editing data
Check each other's work
Ideally, have one person read and the other enter.
Interviewer Error/Cheating
Can occur from marking the wrong response, or falsifying responses
Check for falsification by calling a selected number of respondents back to confirm whether the
original interview was conducted
Sample Selection Error
How big of a problem is it?
Compare data with available population data - does the sample compare with other known data?
Be sure to scale questions in the same way as outside sources of known population data, so that the two can be compared.
Can we do anything to reduce it?
Elimination of duplicates
Carefully comparing the sample frame to the population.
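Comparing the sample with known population data can be as simple as lining up shares by category. The sketch below is illustrative only: the age bands and percentages are invented, and `representation_gaps` is a hypothetical helper name.

```python
def representation_gaps(sample_share, population_share):
    """Sample share minus population share per category, in percentage points."""
    return {
        group: round(sample_share[group] - population_share[group], 1)
        for group in population_share
    }

# Made-up shares (percent) for the same age bands in the sample and in population data.
sample_share = {"18-34": 22.0, "35-54": 40.0, "55+": 38.0}
population_share = {"18-34": 30.0, "35-54": 38.0, "55+": 32.0}

gaps = representation_gaps(sample_share, population_share)
print(gaps)  # {'18-34': -8.0, '35-54': 2.0, '55+': 6.0}
```

Large gaps, such as the under-representation of the youngest band here, suggest a sample selection problem. The comparison only works if the survey question uses the same categories as the outside source.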
Non-response error: an inability to obtain information from sample elements.
Just because someone doesn't respond, it doesn't mean we have a non-response error.
If non-respondents are similar to respondents, there is no non-response error.

Not-at-homes: If there is no answer at a number, or if you get an answering machine, place the subject on a callback list to call during the next calling cycle. Three attempted callbacks are usually adequate and help to reduce non-response bias.
Refusals: the best strategy is to prevent them. What can you do to get the respondent to complete the interview?
Observation Errors: inaccurate information is obtained, created during processing, or communicated in the results.
Harder to determine: we need to compare measured results with the true information to detect them, and in many cases, if we had the true information, we wouldn't need to do the research in the first place!
Response bias occurs when respondents consciously or subconsciously distort their responses. Common types of response bias:
Acquiescence Bias: some respondents tend to agree with all questions
Extremity Bias: some respondents avoid extremes, others gravitate towards them
Interviewer Bias: the interviewer alters a question, or the interviewer's presence or tone of voice influences the answer
Auspices Bias: Who is conducting the research?
Social Desirability: Desire of respondent to create a favorable impression (voting, readership, TV)
Controlling intentional bias
Anonymity (name not associated with answers)
Confidentiality (answers will remain private)
Incentives (encourages participation, creates sense of obligation to tell the truth)
Validation check (used in personal interviews: asking the respondent to demonstrate truthfulness, e.g. "Can I see the medication you take for condition X?")
Third-person technique (makes the question less personal, e.g., Do you think someone like yourself
would consider a physicians
Controlling response bias
Respondent Misunderstanding: respondent provides an answer without understanding the question (watch your terminology and instructions!)
Guessing: respondent gives an answer without knowing if it is accurate (this introduces error, so only ask questions the respondent can answer)
Attention Loss: decreasing motivation to complete the survey (encourage respondents in the instructions) or possible distractions (find a quiet place for interviews)
Respondent Fatigue: respondent tires of answering questions and accuracy drops (keep it short, offer an incentive to complete)
Distractions: respondent gives incorrect responses due to disruptions (look for a quiet, uninterrupted place to administer)
Minimizing fieldwork errors
The following five factors help to improve the overall quality of fieldwork while holding costs at acceptable levels:
1. Selection and training of fieldworkers
2. Administrative control of the projects in the field
3. Supervision of the fieldworkers and data collection process
4. Quality and cost control process
5. Validation of fieldwork

Data Collection in the Field, Non-response Error, and Questionnaire Screening


Non-sampling error includes:
All types of non-response error
Data gathering errors
Data handling errors
Data analysis errors
Interpretation errors
Possible Errors in Field Data Collection
Field worker error: errors committed by the persons who administer the questionnaires
Respondent error: errors committed on the part of the respondent
Errors may be either intentional or unintentional.
Data Collection Errors

Intentional field worker error: errors committed when a data collection person willfully violates the data
collection requirements set forth by the researcher
o Interviewer cheating occurs when the interviewer intentionally misrepresents respondents
o Leading respondents occurs when the interviewer influences respondents' answers through wording, voice inflection, or body language

Unintentional field worker error: errors committed when an interviewer believes he or she is performing correctly
o Interviewer personal characteristics: error occurs because of the interviewer's personal characteristics, such as accent, sex, and demeanor
o Interviewer misunderstanding occurs when the interviewer believes he or she knows how to administer a survey but instead does it incorrectly
o Fatigue-related mistakes occur when the interviewer becomes tired

Intentional respondent error: errors committed when respondents willfully misrepresent themselves in surveys
o Falsehoods occur when respondents fail to tell the truth in surveys
o Nonresponse occurs when the prospective respondent fails to take part in a survey or to answer
specific questions on the survey
Unintentional respondent error: errors committed when a respondent gives a response that is not valid but that he
or she believes is the truth
o Respondent misunderstanding occurs when a respondent gives an answer without comprehending
the question and/or the accompanying instructions
o Guessing occurs when a respondent gives an answer when he or she is uncertain of its accuracy
o Attention loss occurs when a respondent's interest in the survey wanes
o Distractions (such as interruptions) may occur while questionnaire administration takes place
o Fatigue occurs when a respondent becomes tired of participating in a survey
Non-response Error
Non-response: failure on the part of a prospective respondent to take part in a survey or to answer specific
questions on the survey
o Refusals to participate in survey
o Break-offs during the interview
o Refusals to answer certain questions (item omissions)
What counts as a completed interview must be defined.

The response rate is the percentage of the total sample with which interviews were completed.

Reducing Non-response Error

Mail surveys:
o Advance notification
o Monetary incentives
o Follow-up mailings
Telephone surveys:
o Callback attempts
Preliminary Questionnaire Screening

Unsystematic and systematic checks of completed questionnaires


What to look for in questionnaire inspection
Incomplete questionnaires
Non-responses to specific questions (item omissions)
Yea- or nay-saying patterns
Middle-of-the-road patterns

Unreliable responses are found when conducting questionnaire screening, and an inconsistent or unreliable
respondent may need to be eliminated from the sample.
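A systematic check for these patterns can be automated. The sketch below makes several assumptions not stated in the text: answers are on a 1 to 5 agreement scale with 5 meaning "agree", an omitted item is recorded as `None`, and a pattern is flagged only when every answered item has the same value. The function name `screen` is hypothetical.

```python
def screen(answers, scale_max=5):
    """Return screening flags for one questionnaire (None = item omitted)."""
    flags = []
    if any(a is None for a in answers):
        flags.append("item omission")
    given = [a for a in answers if a is not None]
    if given and len(set(given)) == 1:       # the same answer to every question
        value = given[0]
        if value == scale_max:
            flags.append("yea-saying")
        elif value == 1:
            flags.append("nay-saying")
        elif value == (1 + scale_max) / 2:
            flags.append("middle-of-the-road")
    return flags

print(screen([5, 5, 5, 5]))      # ['yea-saying']
print(screen([3, 3, None, 3]))   # ['item omission', 'middle-of-the-road']
print(screen([1, 4, 2, 5]))      # []
```

A flag is a reason to look at the questionnaire more closely, not proof that the respondent is unreliable.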

Minimizing Non-Sampling Error

Non-sampling error cannot be eliminated and cannot be measured (except for non-response error)


Implement CONTROLS to minimize error:
o Close supervision of data collectors
o Training

o Care in constructing questionnaire and instructions: pretest!
o Provide incentives to respondents
o VALIDATION: industry standard is 10%
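Validation can be organized by drawing a random share of completed interviews for callback. This sketch assumes the 10% standard refers to the share of completed interviews that are re-contacted; the interview IDs and the helper name `validation_sample` are made up.

```python
import math
import random

def validation_sample(interview_ids, share=0.10, rng=random):
    """Pick a random share of completed interviews for callback verification."""
    k = math.ceil(len(interview_ids) * share)   # round up so small batches still get checked
    return rng.sample(list(interview_ids), k)

completed = list(range(1, 201))                 # hypothetical interview IDs 1..200
callbacks = validation_sample(completed, rng=random.Random(5))
print(len(callbacks))  # 20
```

Selecting the callbacks at random, rather than letting interviewers or supervisors choose them, keeps the check from being steered away from falsified interviews.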
