
Lecture 7

Sampling Design
Sampling Approaches and Considerations

Learning Objectives:
1. Understand the key principles in sampling.
2. Appreciate the difference between the target
population and the sampling frame.
3. Recognize the difference between probability
and non-probability sampling.
4. Describe the different sampling methods.
5. Determine the appropriate sample size.

Advantages of Sampling
• Lower cost
  - cheaper than studying the whole population
• Fewer errors, because there is less fatigue
  - better results
• Less time
  - quicker
• Destruction of elements avoided
  - e.g. bulbs

Sampling vs. Census ?


Go On-Line
www.surveysampling.com

A census involves collecting data from all members of a population.

A sample is a relatively small subset of the population that is selected to be
representative of the population’s characteristics.

Sample vs. Census


Table 11.1

                                      Conditions Favoring the Use of
Type of Study                         Sample          Census

1. Budget                             Small           Large
2. Time available                     Short           Long
3. Population size                    Large           Small
4. Variance in the characteristic     Small           Large
5. Cost of sampling errors            Low             High
6. Cost of nonsampling errors         High            Low
7. Nature of measurement              Destructive     Nondestructive
8. Attention to individual cases      Yes             No

Relationship between Sample Statistics and Population Parameters

Symbols for Population and Sample Variables

Table 12.1

Variable                            Population                Sample

Mean                                \(\mu\)                   \(\bar{X}\)
Proportion                          \(\pi\)                   \(p\)
Variance                            \(\sigma^2\)              \(s^2\)
Standard deviation                  \(\sigma\)                \(s\)
Size                                \(N\)                     \(n\)
Standard error of the mean          \(\sigma_{\bar{X}}\)      \(S_{\bar{X}}\)
Standard error of the proportion    \(\sigma_p\)              \(S_p\)
Standardized variate (z)            \((X - \mu)/\sigma\)      \((X - \bar{X})/S\)
Coefficient of variation (C)        \(\sigma/\mu\)            \(S/\bar{X}\)

Sampling Design Process

The sampling design process involves answering three questions:
1. Should a sample or a census be used?
2. If a sample, then which sampling approach is best?
3. How large a sample is necessary?

To obtain a representative sample ...

Steps to follow:

1. Define the target population.

2. Choose the sampling frame.

3. Select the sampling method.

4. Determine the sample size.

5. Implement the sampling plan.



Representative Sample
A representative sample mirrors the characteristics of the population
and minimizes the errors associated with sampling.

Target Population

. . . the complete group of objects or elements relevant to the research
project. They are relevant because they possess the information the
research project is designed to collect.

Sampling Unit

. . . the elements or objects available for selection during the sampling
process are known as sampling units.

Sampling Frame

. . . as complete a list as possible of all the elements in the population
from which the sample is drawn.

Relationship between Population, Sampling Frame and Sample

The sampling frame often is flawed because . . .

• It may not be up to date.
• It may include elements that do not belong to the target population.
• It may not include elements that do belong to the target population.
• It may be compiled from multiple lists and contain duplicate elements.

Population, Element, Sampling Frame, Sample and Subject

• Population (or target population)
  - the entire group of people, events or things of interest that the
    researcher wishes to investigate
• Element
  - a single member of the population
• Sampling Frame
  - a listing of all the elements in the population from which the sample
    is drawn
• Sample
  - a subset of the population
• Subject
  - a single member of the sample

Sampling Methods

Go On-Line
www.svys.com

Probability

Non-Probability

Types of Sampling Methods

Probability          Non-Probability
Simple Random        Convenience
Systematic           Judgment
Stratified           Snowball/Referral
Cluster              Quota
Multi-Stage

Classification of Sampling Techniques

Fig. 11.2: Sampling techniques divide into nonprobability sampling techniques
(convenience, judgmental, quota and snowball sampling) and probability
sampling techniques (simple random, systematic, stratified and cluster
sampling, plus other sampling techniques).

Probability vs. Non-Probability Sampling
Go On-Line
www.surveysystem.com/sscalc.html

Probability = each element of the population has a known, but not necessarily
equal, probability of being selected in a sample.

Non-Probability = not every element of the target population has a chance of
being selected because the inclusion or exclusion of elements in a sample is
left to the discretion of the researcher.

Simple Random Sampling

. . . a sampling method in which each element of the population has an equal
probability of being selected.

Simple Random Sampling


• Each element in the population has a known and equal probability of
  selection.
• Each possible sample of a given size (n) has a known and equal probability
  of being the sample actually selected.
• This implies that every element is selected independently of every other
  element.

Procedures for Drawing Probability Samples

Fig. 11.4

Simple Random Sampling

1. Select a suitable sampling frame
2. Each element is assigned a number from 1 to N (pop. size)
3. Generate n (sample size) different random numbers between 1 and N
4. The numbers generated denote the elements that should be included in the
   sample
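
The four steps above can be sketched in a few lines of Python. This is only an illustration: the function name `simple_random_sample`, the `seed` parameter and the `customer_k` frame are assumptions made for the example, not part of the lecture.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct elements from a sampling frame with equal probability."""
    rng = random.Random(seed)  # seed is optional; it only makes the draw repeatable
    N = len(frame)
    if not 0 <= n <= N:
        raise ValueError("sample size n must be between 0 and N")
    # Steps 2-3: number the elements 1..N and generate n different random numbers.
    numbers = rng.sample(range(1, N + 1), n)
    # Step 4: the numbers generated denote the elements included in the sample.
    return [frame[k - 1] for k in numbers]

# Hypothetical sampling frame of 100 customers.
frame = [f"customer_{k}" for k in range(1, 101)]
sample = simple_random_sample(frame, n=10, seed=42)
print(sample)
```

Because `random.sample` draws without replacement, no element can appear twice in the sample.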

Systematic Sampling

. . . a process that involves randomly selecting an initial starting point on
a list, and thereafter every i-th element in the sampling frame, where i is
the sampling interval.
Procedures for Drawing Probability Samples

Fig. 11.4 cont.

Systematic Sampling

1. Select a suitable sampling frame
2. Each element is assigned a number from 1 to N (pop. size)
3. Determine the sampling interval i: i = N/n. If i is a fraction, round to
   the nearest integer
4. Select a random number, r, between 1 and i, as explained in simple random
   sampling
5. The elements with the following numbers will comprise the systematic
   random sample: r, r+i, r+2i, r+3i, r+4i, ..., r+(n-1)i
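
As a rough sketch of steps 3 to 5, the function below returns the element numbers that a systematic sample would contain. The name `systematic_sample_numbers` and the optional `r` and `seed` arguments are assumptions for illustration. The procedure does not say what to do if rounding i pushes the last numbers past N, so the sketch simply drops them.

```python
import random

def systematic_sample_numbers(N, n, r=None, seed=None):
    """Return the 1-based element numbers chosen by systematic sampling."""
    # Step 3: sampling interval i = N/n, rounded to the nearest integer.
    i = max(1, int(N / n + 0.5))
    # Step 4: random start r between 1 and i (inclusive), unless one is supplied.
    if r is None:
        r = random.Random(seed).randint(1, i)
    if not 1 <= r <= i:
        raise ValueError("r must lie between 1 and i")
    # Step 5: r, r+i, r+2i, ..., r+(n-1)i. Numbers beyond N are dropped
    # (an assumption; this can only happen when N/n was rounded up).
    return [r + k * i for k in range(n) if r + k * i <= N]

# N = 100,000 and n = 1,000 give i = 100; with r = 23 the sample starts at 23.
numbers = systematic_sample_numbers(N=100_000, n=1_000, r=23)
print(numbers[:6])  # [23, 123, 223, 323, 423, 523]
```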

Systematic Sampling
• The sample is chosen by selecting a random starting point and then picking
  every ith element in succession from the sampling frame.
• The sampling interval, i, is determined by dividing the population size N
  by the sample size n and rounding to the nearest integer.
• When the ordering of the elements is related to the characteristic of
  interest, systematic sampling increases the representativeness of the
  sample.
• If the ordering of the elements produces a cyclical pattern, systematic
  sampling may decrease the representativeness of the sample.

For example, there are 100,000 elements in the population and a sample of
1,000 is desired. In this case the sampling interval, i, is 100. A random
number between 1 and 100 is selected. If, for example, this number is 23, the
sample consists of elements 23, 123, 223, 323, 423, 523, and so on.

Stratified Sampling

. . . requires the researcher to partition the target population into
relatively homogeneous subgroups that are distinct and non-overlapping.

Stratified Sampling
• A two-step process in which the population is first partitioned into
  subpopulations, or strata.
• The strata should be mutually exclusive and collectively exhaustive in that
  every population element should be assigned to one and only one stratum and
  no population elements should be omitted.
• Next, elements are selected from each stratum by a random procedure,
  usually simple random sampling (SRS).
• A major objective of stratified sampling is to increase precision without
  increasing cost.

Stratified Sampling
• The elements within a stratum should be as homogeneous as possible, but the
  elements in different strata should be as heterogeneous as possible.
• The stratification variables should also be closely related to the
  characteristic of interest.
• Finally, the variables should decrease the cost of the stratification
  process by being easy to measure and apply.
• In proportionate stratified sampling, the size of the sample drawn from
  each stratum is proportionate to the relative size of that stratum in the
  total population.
• In disproportionate stratified sampling, the size of the sample from each
  stratum is proportionate to the relative size of that stratum and to the
  standard deviation of the distribution of the characteristic of interest
  among all the elements in that stratum.
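
The two allocation rules can be compared with a small sketch. It assumes one common way to formalize the disproportionate rule: each stratum's share is proportional to its size times its standard deviation (often called Neyman allocation). The function names, stratum labels and numbers are made up for illustration, and simple rounding is used, so in general the allocated sizes may not sum exactly to n.

```python
def proportionate_allocation(stratum_sizes, n):
    """n_h = n * N_h / N for each stratum h."""
    N = sum(stratum_sizes.values())
    return {h: round(n * N_h / N) for h, N_h in stratum_sizes.items()}

def disproportionate_allocation(stratum_sizes, stratum_sds, n):
    """n_h proportional to N_h * sd_h (assumed Neyman-style allocation)."""
    weights = {h: stratum_sizes[h] * stratum_sds[h] for h in stratum_sizes}
    total = sum(weights.values())
    return {h: round(n * w / total) for h, w in weights.items()}

# Hypothetical strata: population sizes N_h and standard deviations sd_h.
sizes = {"A": 600, "B": 300, "C": 100}
sds = {"A": 10.0, "B": 20.0, "C": 30.0}

prop = proportionate_allocation(sizes, n=100)
disp = disproportionate_allocation(sizes, sds, n=100)
print(prop)  # {'A': 60, 'B': 30, 'C': 10}
print(disp)  # {'A': 40, 'B': 40, 'C': 20}
```

In this example the small but highly variable stratum C receives twice the sample it would get under proportionate allocation.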

Two Types of Stratified Sampling

Proportionate = the number of elements chosen from each of the strata is
proportionate to the size of that stratum relative to the overall population
size.

Disproportionate = the number of elements chosen from each of the strata is
not based on the size of the stratum relative to the target population size,
but rather is based either on the importance of a particular stratum or its
variability.

Proportionate & Disproportionate Stratified Random Sampling
Procedures for Drawing Probability Samples

Fig. 11.4 cont.

Stratified Sampling

1. Select a suitable frame
2. Select the stratification variable(s) and the number of strata, H
3. Divide the entire population into H strata. Based on the classification
   variable, each element of the population is assigned to one of the H strata
4. In each stratum, number the elements from 1 to \(N_h\) (the pop. size of
   stratum h)
5. Determine the sample size of each stratum, \(n_h\), based on proportionate
   or disproportionate stratified sampling, where
   \[ \sum_{h=1}^{H} n_h = n \]
6. In each stratum, select a simple random sample of size \(n_h\)
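
Steps 3 to 6 might look like the following in Python. This is a sketch, assuming the per-stratum sample sizes have already been decided. The helper name `stratified_sample`, the `stratum_of` callback and the urban/rural population are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(population, stratum_of, sizes, seed=None):
    """Draw a simple random sample of size sizes[h] from each stratum h."""
    rng = random.Random(seed)
    # Step 3: assign every element to exactly one stratum.
    strata = defaultdict(list)
    for element in population:
        strata[stratum_of(element)].append(element)
    # Steps 4-6: within each stratum, draw n_h elements by SRS.
    return {h: rng.sample(strata[h], n_h) for h, n_h in sizes.items()}

# Hypothetical population: 70 urban and 30 rural households.
population = [{"id": k, "region": "urban" if k <= 70 else "rural"}
              for k in range(1, 101)]
sizes = {"urban": 7, "rural": 3}  # proportionate allocation of n = 10

sample = stratified_sample(population, lambda e: e["region"], sizes, seed=1)
for stratum, members in sample.items():
    print(stratum, [m["id"] for m in members])
```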

Cluster Sampling

. . . a form of probability
sampling in which the
relatively homogeneous
individual clusters where
sampling occurs are chosen
randomly and not all
clusters are sampled.

Cluster Sampling
• The target population is first divided into mutually exclusive and
  collectively exhaustive subpopulations, or clusters.
• Then a random sample of clusters is selected, based on a probability
  sampling technique such as SRS.
• For each selected cluster, either all the elements are included in the
  sample (one-stage) or a sample of elements is drawn probabilistically
  (two-stage).
• Elements within a cluster should be as heterogeneous as possible, but
  clusters themselves should be as homogeneous as possible. Ideally, each
  cluster should be a small-scale representation of the population.
• In probability proportionate to size sampling, the clusters are sampled
  with probability proportional to size. In the second stage, the probability
  of selecting a sampling unit in a selected cluster varies inversely with
  the size of the cluster.
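
One way to see why the second-stage probability should vary inversely with cluster size is to multiply the two stages together. The sketch below assumes that c clusters are drawn with probability proportional to size, so that cluster j, of size \(N_j\), is selected with probability of roughly \(c N_j / N\) (which presumes no cluster is large enough to be selected with certainty), and that a fixed number m of units is then drawn by SRS within each selected cluster. The symbol m is introduced here for illustration only.

\[
P(\text{unit in cluster } j \text{ is sampled}) \approx \frac{c\,N_j}{N} \times \frac{m}{N_j} = \frac{c\,m}{N}
\]

Under these assumptions the cluster size cancels, so every unit has the same overall chance of selection whichever cluster it belongs to.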

Types of Cluster Sampling

Fig. 11.3: Cluster sampling divides into one-stage sampling, two-stage
sampling and multistage sampling; a further level of the figure lists simple
cluster sampling and probability proportionate to size sampling.

Multi-Stage Cluster Sampling
Cluster sampling involves dividing the population into clusters and randomly
selecting a pre-specified number of clusters, and then collecting information
either from all the elements in each selected cluster or from a random sample
of them. With multi-stage cluster sampling the same process is completed two
or more times.
Procedures for Drawing Probability Samples

Fig. 11.4 cont.

Cluster Sampling

1. Assign a number from 1 to N to each element in the population
2. Divide the population into C clusters of which c will be included in the
   sample
3. Calculate the sampling interval i, i = N/c (round to nearest integer)
4. Select a random number r between 1 and i, as explained in simple random
   sampling
5. Identify elements with the following numbers: r, r+i, r+2i, ..., r+(c-1)i
6. Select the clusters that contain the identified elements
7. Select sampling units within each selected cluster based on SRS or
   systematic sampling
8. Remove clusters exceeding sampling interval i. Calculate new population
   size N*, number of clusters to be selected c* = c - 1, and new sampling
   interval i*.
11-38

Procedures for Drawing Probability Samples

Fig. 11.4 cont.

Cluster Sampling

Repeat the process until each of the remaining clusters has a population less
than the sampling interval. If b clusters have been selected with certainty,
select the remaining c - b clusters according to steps 1 through 7. The
fraction of units to be sampled with certainty is the overall sampling
fraction, n/N. Thus, for clusters selected with certainty, we would select
\(n_s = (n/N)(N_1 + N_2 + \dots + N_b)\) units. The units selected from
clusters selected under PPS sampling will therefore be \(n^* = n - n_s\).
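
Steps 1 to 6 of the cluster procedure (choosing clusters by stepping through the numbered elements) can be sketched as below. The function name and the cluster sizes are hypothetical. The sketch deliberately leaves out step 8 and the certainty-cluster adjustment, so it assumes every cluster is smaller than the sampling interval.

```python
import random
from bisect import bisect_left
from itertools import accumulate

def pps_systematic_clusters(cluster_sizes, c, r=None, seed=None):
    """Return the indices (0-based) of the clusters hit by r, r+i, r+2i, ..."""
    N = sum(cluster_sizes)
    # Step 3: sampling interval i = N/c, rounded to the nearest integer.
    i = max(1, int(N / c + 0.5))
    # Step 4: random start r between 1 and i, unless one is supplied.
    if r is None:
        r = random.Random(seed).randint(1, i)
    # Step 1: elements are numbered 1..N cluster by cluster, so cluster k
    # holds the numbers from bounds[k-1] + 1 up to bounds[k].
    bounds = list(accumulate(cluster_sizes))
    selected = []
    for k in range(c):
        number = r + k * i          # Step 5: identified element numbers
        if number > N:
            break
        # Step 6: the cluster that contains the identified element.
        selected.append(bisect_left(bounds, number))
    return selected

# Hypothetical clusters with 500 elements in total; c = 2 gives i = 250.
sizes = [50, 100, 50, 200, 100]
chosen = pps_systematic_clusters(sizes, c=2, r=100)
print(chosen)  # [1, 3]: elements 100 and 350 fall in the 2nd and 4th clusters
```

Larger clusters cover more element numbers and are therefore more likely to be hit, which is what makes the selection proportionate to size.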

Convenience Sampling

. . . involves selecting sample elements that are most readily available to
participate in the study and that can provide the required information.

Convenience Sampling
Convenience sampling attempts to obtain a sample of convenient elements.
Often, respondents are selected because they happen to be in the right place
at the right time.

• use of students, and members of social organizations
• mall intercept interviews without qualifying the respondents
• department stores using charge account lists
• “people on the street” interviews

Judgment Sampling

. . . a form of convenience
sampling, sometimes referred to
as a purposive sample, in which
the researcher’s judgment is used
to select the sample elements.

Judgmental Sampling
Judgmental sampling is a form of convenience sampling in which the population
elements are selected based on the judgment of the researcher.

• test markets
• purchase engineers selected in industrial marketing research
• bellwether precincts selected in voting behavior research
• expert witnesses used in court

Quota Sampling

. . . similar to proportionately stratified random sampling, but the
selection of the elements from the strata is done on a convenience basis.

Quota Sampling
Quota sampling may be viewed as two-stage restricted judgmental sampling.
• The first stage consists of developing control categories, or quotas, of
  population elements.
• In the second stage, sample elements are selected based on convenience or
  judgment.

Control             Population composition    Sample composition
Characteristic      Percentage                Percentage    Number
Sex
  Male              48                        48            480
  Female            52                        52            520
                    ____                      ____          ____
                    100                       100           1000
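
The two stages can be sketched as follows: quotas are first derived from the population percentages, then respondents are accepted in whatever order they happen to turn up until each quota is full. The function names and the alternating stream of respondents are made up for the example. Real convenience selection would not be this orderly.

```python
def quotas_from_percentages(percentages, n):
    """Stage 1: turn population percentages into quota counts for a sample of n."""
    return {category: round(pct * n / 100) for category, pct in percentages.items()}

def quota_sample(respondents, quotas, key):
    """Stage 2: accept respondents as they come until every quota is filled."""
    remaining = dict(quotas)
    sample = []
    for person in respondents:
        category = key(person)
        if remaining.get(category, 0) > 0:
            sample.append(person)
            remaining[category] -= 1
        if not any(remaining.values()):
            break  # all quotas are full
    return sample

quotas = quotas_from_percentages({"Male": 48, "Female": 52}, n=1000)
print(quotas)  # {'Male': 480, 'Female': 520}

# Hypothetical stream of conveniently available respondents.
stream = [{"id": k, "sex": "Male" if k % 2 else "Female"} for k in range(2000)]
sample = quota_sample(stream, quotas, key=lambda p: p["sex"])
print(len(sample))  # 1000
```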

Snowball Sampling

. . . also called a referral sample; the initial respondents typically are
chosen using probability methods, and these respondents then identify others
in the target population.

Snowball Sampling
In snowball sampling, an initial group of respondents is selected, usually at
random.

• After being interviewed, these respondents are asked to identify others who
  belong to the target population of interest.
• Subsequent respondents are selected based on the referrals.

Choice Points in Sampling Design


Choosing Nonprobability vs. Probability Sampling

Table 11.4 cont.

                                    Conditions Favoring the Use of
Factors                             Nonprobability sampling    Probability sampling

Nature of research                  Exploratory                Conclusive
Relative magnitude of sampling      Nonsampling errors         Sampling errors
and nonsampling errors              are larger                 are larger
Variability in the population       Homogeneous (low)          Heterogeneous (high)
Statistical considerations          Unfavorable                Favorable
Operational considerations          Favorable                  Unfavorable


Strengths and Weaknesses of Basic Sampling Techniques

Table 11.3

Technique                       Strengths                                Weaknesses

Nonprobability sampling
Convenience sampling            Least expensive, least time-consuming,   Selection bias, sample not representative,
                                most convenient                          not recommended for descriptive or causal research
Judgmental sampling             Low cost, convenient,                    Does not allow generalization, subjective
                                not time-consuming
Quota sampling                  Sample can be controlled for             Selection bias, no assurance of
                                certain characteristics                  representativeness
Snowball sampling               Can estimate rare characteristics        Time-consuming

Probability sampling
Simple random sampling (SRS)    Easily understood, results projectable   Difficult to construct sampling frame, expensive,
                                                                         lower precision, no assurance of representativeness
Systematic sampling             Can increase representativeness,         Can decrease representativeness
                                easier to implement than SRS,
                                sampling frame not necessary
Stratified sampling             Includes all important subpopulations,   Difficult to select relevant stratification variables,
                                precision                                not feasible to stratify on many variables, expensive
Cluster sampling                Easy to implement, cost effective        Imprecise, difficult to compute and interpret results

Determining sample size involves achieving a balance between several factors:

• The variability of elements in the target population.
• The type of sample required.
• Time available.
• Budget.
• Required estimation precision.
• Whether findings will be generalized.

Three decisions to make when statistical formulas are used to determine
sample size:

1. The degree of confidence (often 95%).
2. The specified level of precision (amount of acceptable error).
3. The amount of variability (population homogeneity).

Sampling Approaches and Considerations

Go On-Line
http://random.mat.sbg.ac.at/links

How would this website be useful to business researchers?

Precision and Confidence

• Precision
  - refers to how close the sample estimate (e.g. \(\bar{X}\)) is to the true
    population characteristic (\(\mu\)); it depends on the variability in the
    sampling distribution of the mean, i.e. the standard error
    (\(S_{\bar{X}}\))
  - indicates the confidence interval within which the population mean can be
    estimated (\(\mu = \bar{X} \pm K S_{\bar{X}}\))
• Confidence
  - reflects the level of certainty that the sample estimates will actually
    hold true for the population
  - bias is absent from the data
  - accuracy is reflected by the confidence level (\(K\))

Standard Error

\[ S_{\bar{X}} = \frac{S}{\sqrt{n}} \]

\(S\) = standard deviation of the sample
\(n\) = sample size
\(S_{\bar{X}}\) = standard error, or standard deviation of the sample mean

Characteristics of the Standard Error


\[ S_{\bar{X}} = \frac{S}{\sqrt{n}} \]

• The smaller the standard deviation of the population, the smaller the
  standard error and the greater the precision.
• The standard error varies inversely with the square root of the sample
  size. Hence the larger the n, the smaller the standard error, and the
  greater the precision.
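
A short sketch makes the square-root relationship concrete. The function names and all numbers below are illustrative assumptions. `statistics.stdev` is the standard-library sample standard deviation (with the n - 1 divisor).

```python
import math
import statistics

def standard_error(values):
    """Standard error of the mean: S / sqrt(n), with S estimated from the data."""
    s = statistics.stdev(values)  # sample standard deviation
    return s / math.sqrt(len(values))

def standard_error_from_s(s, n):
    """Same formula when S and n are already known."""
    return s / math.sqrt(n)

# Quadrupling the sample size halves the standard error.
se_100 = standard_error_from_s(s=3500, n=100)
se_400 = standard_error_from_s(s=3500, n=400)
print(se_100, se_400)  # 350.0 175.0

# Hypothetical small data set.
se_data = standard_error([2, 4, 4, 4, 5, 5, 7, 9])
print(round(se_data, 3))
```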

Confidence Interval for the Mean

\[ \mu = \bar{X} \pm K S_{\bar{X}} \]

\(\mu\) = population mean
\(\bar{X}\) = sample mean
\(S_{\bar{X}}\) = standard error
\(K\) = z statistic for large samples (\(n \ge 30\))
      = t statistic for small samples (\(n < 30\))

Confidence Levels
\[ \mu = \bar{X} \pm K S_{\bar{X}} \]

• For large samples, K = z score
  = 1.65 for 90% confidence level
  = 1.96 for 95% confidence level
  = 2.58 for 99% confidence level

• Example: a 95% confidence interval for mean purchases (µ) by customers
  based on a sample mean of $105 with a standard error of $1.43 is:
  µ = 105 ± 1.96 × 1.43 = 105 ± 2.80
  Hence µ would fall between $102.20 and $107.80
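
The purchase example can be reproduced with a small helper. The function name `confidence_interval` is an assumption for illustration; the inputs are the figures from the example.

```python
def confidence_interval(mean, standard_error, k=1.96):
    """Return (lower, upper) for mean ± k * standard_error."""
    margin = k * standard_error
    return mean - margin, mean + margin

# Sample mean $105, standard error $1.43, K = 1.96 for 95% confidence.
low, high = confidence_interval(105, 1.43, k=1.96)
print(f"{low:.2f} to {high:.2f}")  # 102.20 to 107.80
```

Passing k=2.58 instead widens the interval, which is the trade-off between confidence and precision.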

Trade-off between Precision and Confidence

Determining the Sample Size

Example: Suppose a manager wants to be 95% confident that the estimate of
mean withdrawals from a bank will be within ± $500 of the true mean (the
precision, or margin of error). From a sample of customers the standard
deviation S was calculated as $3500. What sample size is needed?

\[ \mu = \bar{X} \pm K S_{\bar{X}} \]
The expression \(K S_{\bar{X}}\) is equivalent to the precision or admissible
margin of error. Let this be E.
\[ E = K S_{\bar{X}} \quad \text{or} \quad E = \frac{K S}{\sqrt{n}} \]

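Determining the Sample Size (cont’d)

Rearranging these terms, a formula for the sample size n is:
\[ n = \left( \frac{K S}{E} \right)^2 \]
Substituting K = 1.96 (95% confidence), S = 3500, and E = 500 into this
equation provides the sample size n:
\[ n = \left( \frac{1.96 \times 3500}{500} \right)^2 = 13.72^2 \approx 188.2 \]
so n is about 188 (189 if rounded up so that the margin of error does not
exceed E).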
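
The same calculation as a sketch in Python. The function name `required_sample_size` is an assumption. Rounding up with `math.ceil` is a conservative choice and is not stated in the slide.

```python
import math

def required_sample_size(k, s, e):
    """n = (K * S / E)^2, returned unrounded."""
    return (k * s / e) ** 2

# Bank withdrawal example: K = 1.96, S = 3500, E = 500.
n_exact = required_sample_size(k=1.96, s=3500, e=500)
n_rounded_up = math.ceil(n_exact)
print(round(n_exact, 2), n_rounded_up)  # 188.24 189
```

Because n grows with the square of K·S/E, halving the admissible error E quadruples the required sample size.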
11-62

Sample Size for Estimating Multiple Parameters

Table 12.3

                                            Mean Household Monthly Expense On
Variable                                    Department store shopping    Clothes    Gifts

Confidence level                            95%                          95%        95%
z value                                     1.96                         1.96       1.96
Precision level (D)                         $5                           $5         $4
Standard deviation of the population (σ)    $55                          $40        $30
Required sample size (n)                    465                          246        217
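
The required sample sizes in the table appear to follow from the same kind of formula used for the mean, written with the table's symbols as \(n = (z\sigma/D)^2\) and rounded up to the next whole number. That formula is an assumption here, since the table itself does not state it, but it reproduces all three values:

\[
\begin{aligned}
\text{Department store shopping: } n &= \left(\frac{1.96 \times 55}{5}\right)^2 = 21.56^2 \approx 464.8 \;\rightarrow\; 465 \\
\text{Clothes: } n &= \left(\frac{1.96 \times 40}{5}\right)^2 = 15.68^2 \approx 245.9 \;\rightarrow\; 246 \\
\text{Gifts: } n &= \left(\frac{1.96 \times 30}{4}\right)^2 = 14.70^2 \approx 216.1 \;\rightarrow\; 217
\end{aligned}
\]

When several parameters are estimated from one survey, the largest of these values (465) is the one that satisfies every precision requirement.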


Roscoe’s Rules of Thumb for Determining Sample Size

• Sample sizes larger than 30 and smaller than 500 are appropriate for most
  research
• Minimum sample size of 30 for each sub-category is usually necessary
• In multivariate research, the sample size should be several times as large
  as the number of variables in the study
• For simple experimental research, successful research is possible with
  samples as small as 10 to 20

Efficiency in Sampling

If n is constant, you should get a smaller \(S_{\bar{X}}\)
or
for the same \(S_{\bar{X}}\), you should use a smaller n

Review of Sample Size Decisions


• How much precision is wanted in estimating the population characteristics,
  i.e. what is the margin of admissible error or confidence interval?
• How much confidence is really needed? How much risk can we take of making
  errors in estimating the population parameters (i.e. confidence level)?
• How much variability is in the population? The greater the variability, the
  larger the sample size needed.
• Cost and time constraints
• The size of the population (N) itself
