VERA TOEPOEL
MARCEL DAS
ARTHUR VAN SOEST
Tilburg University
This article analyzes the effects of an experimental manipulation of the number of items per screen in a Web survey with forty questions aimed at measuring arousal. The authors consider effects on survey answers, item nonresponse, interview length, and the respondents' evaluation of several aspects of the survey (such as layout). Four different formats are used, with one, four, ten, and forty items and headers on a screen. The authors find no effect of format on the arousal index, but nonresponse increases with the number of items appearing on a single screen. Having multiple items on a screen shortens the duration of the interview but negatively influences the respondent's evaluation of the questionnaire layout. Grouping effects are generally similar for different demographic groups, though there are some differences in magnitude and significance level.
INTRODUCTION
This article discusses the impact of one feature of Web survey design: the number of items presented on each single screen. Some existing studies are described in the next section. A questionnaire of forty items aimed at constructing an arousal scale is presented to respondents in a Web survey that is broadly representative of the Dutch population. The experimental design is described in the third section. The format is randomized: Each respondent is presented one, four, ten, or all forty questions on one screen. We then analyze the effect of the questionnaire format on the mean of the forty answers (the arousal score) and the variance (effects on the outcome variables), item nonresponse (effects on item nonresponse), the time the respondent needs to complete the questionnaire (effects on the time needed to complete the interview), and the respondent's evaluation of the questionnaire (effects on the evaluation of the questionnaire). In addition, we investigate whether any effects are tied to personal characteristics (effects for different demographic subgroups), an issue at the frontier of Web survey methodology (Dillman 2007; Stern, Dillman, and Smyth 2007).
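To make the outcome variables concrete, the following minimal sketch (in Python, on simulated data; the variable names and the 1-5 answer scale are illustrative assumptions, not taken from the actual questionnaire) computes an arousal score as the mean of the forty answers and counts item nonresponse per respondent.

import numpy as np

# Simulated response matrix: 2,565 respondents x 40 arousal items on an
# assumed 1-5 disagree/agree scale; np.nan marks an unanswered item.
rng = np.random.default_rng(0)
answers = rng.integers(1, 6, size=(2565, 40)).astype(float)
answers[rng.random(answers.shape) < 0.01] = np.nan  # sprinkle in missing answers

# Arousal score: mean over the answered items for each respondent.
arousal_score = np.nanmean(answers, axis=1)
# Item nonresponse: number of unanswered items per respondent.
item_nonresponse = np.isnan(answers).sum(axis=1)

print(arousal_score.mean(), arousal_score.var(), item_nonresponse.mean())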
BACKGROUND
Gestalt principles of visual perception also address this issue. The law of proximity (objects placed close together are perceived as a group), as well as the laws of similarity (objects sharing the same visual properties are grouped together) and common region (elements within a single closed region are grouped together), suggests that grouping items on one page affects the answering process (Dillman 2007).
Items are more likely to be seen as related if they are grouped on one screen, reflecting the natural assumption that blocks of questions bear on related issues, much as they would in ordinary conversation (Schwarz and Sudman 1996; Sudman, Bradburn, and Schwarz 1996). Couper, Traugott, and Lamias (2001) concluded that correlations are consistently higher among items appearing together on a screen than among items separated across several screens, but the effect is small, and the differences between pairs of correlations are statistically insignificant. Tourangeau, Couper, and Conrad (2004) replicated these findings and did find significant differences between correlations. They concluded that respondents apparently use the proximity of the items as a cue to their meaning, perhaps at the expense of reading each item carefully. Peytchev et al. (2006) found no significant differences between measurements using a paging and a scrolling design. Bradlow and Fitzsimons (2001) found that when items are not labeled or clustered, respondents base their responses on the previous item to a greater degree, regardless of whether the items are intrinsically related.
According to Bowker and Dillman (2000), the questionnaire format may
increase partial nonresponse if respondents dislike it so much that they fail to
complete the survey. Lozar Manfreda, Batagelj, and Vehovar (2002) find no
evidence of differences in partial nonresponse between one- and multiple-
page designs, but they do find that a one-page design results in higher item
nonresponse.
Interview duration may affect response rates as well as the willingness to cooperate in (future) surveys (Deutskens et al. 2004). Couper, Traugott, and Lamias (2001); Lozar Manfreda, Batagelj, and Vehovar (2002); and Tourangeau, Couper, and Conrad (2004) all find evidence that a multiple-items-per-screen design takes less time to complete than a one-item-per-screen design. Moreover, Deutskens (2006) finds that highly motivated respondents are more willing to participate in (follow-up) surveys and that one of the strongest factors driving motivation is how much the respondent enjoyed answering the questions. Particularly in panel surveys, where reducing attrition is an important concern, this makes it worthwhile to analyze the effect of questionnaire format on how the respondent evaluates the attractiveness of the survey.
TABLE 1
Number of Respondents Who Completed the Questionnaire for the Different Formats
Respondents were randomly assigned to one of four groups. The first group answered one item per screen (format 1); the second group answered four items per screen (format 2, using less than half of the screen); the third group answered ten items per screen (format 3, using the whole screen). The fourth group answered all forty items on one single screen (scrollable design; see Appendix, format 4).

Because of the limited height of the screen, people with forty items per screen had to scroll to fill in all the items. Scrolling down made it impossible to see the header (totally disagree–totally agree) when it was shown only at the top of each screen. We formed three additional groups, with four, ten, and forty items on one screen but with a header displayed at each item (resulting in four, ten, and forty headers per screen, as opposed to one header per screen). For most of the analyses, we combined the single-header- and multiple-header-per-screen formats,1 as we found few differences between them; we will discuss explicitly any differences that we did find. We therefore speak of four different formats. Note that the exact appearance of the questionnaire on the respondent's screen depends on screen settings (e.g., height and resolution). In particular, what respondents with a Net.Box see is somewhat different from what others see. Still, we did not find significant differences between Net.Box and other respondents.

The experiment was fielded in December 2004; 2,565 respondents (69% of selected panel members) completed the questionnaire. See Table 1 for the number of respondents in each format.
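The paging logic underlying the four formats is simple to express. The sketch below (Python; the item labels and format codes are illustrative assumptions, not taken from the authors' survey software) randomly assigns a respondent to a format and splits the forty items into consecutive screens of the corresponding size.

import random

ITEMS = [f"item_{i:02d}" for i in range(1, 41)]  # the forty arousal items
ITEMS_PER_SCREEN = {1: 1, 2: 4, 3: 10, 4: 40}    # format code -> items per screen

def assign_format(rng: random.Random) -> int:
    """Randomly assign a respondent to one of the four formats."""
    return rng.choice([1, 2, 3, 4])

def paginate(items, per_screen):
    """Split the item list into consecutive screens of the given size."""
    return [items[i:i + per_screen] for i in range(0, len(items), per_screen)]

rng = random.Random(0)
fmt = assign_format(rng)
screens = paginate(ITEMS, ITEMS_PER_SCREEN[fmt])
print(f"format {fmt}: {len(screens)} screen(s) of up to {ITEMS_PER_SCREEN[fmt]} items")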
RESULTS
TABLE 2
Regression Results on Dummies for the Different Formats
(format 1 is taken as reference; entries are coefficients and p values)
NOTE: (A) Linear regression of the number of missing items; (B) logit regression of a dummy equal to 1 if one or more items are missing and 0 otherwise.
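The two specifications in the note to Table 2 can be expressed along the following lines. This is a minimal sketch using pandas and statsmodels on fabricated data; the data frame, column names, and simulated outcome are illustrative assumptions, not the authors' data or code, so the estimated coefficients carry no substantive meaning.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Fabricated respondent-level data: assigned format and number of missing items.
rng = np.random.default_rng(0)
df = pd.DataFrame({"format": rng.choice([1, 2, 3, 4], size=2565)})
df["n_missing"] = rng.poisson(0.1 * df["format"])  # simulated outcome for the sketch

# (A) Linear regression of the number of missing items on format dummies;
# C(format) takes format 1 as the reference category, as in Table 2.
ols = smf.ols("n_missing ~ C(format)", data=df).fit()
print(ols.summary().tables[1])

# (B) Logit regression of a dummy for having at least one missing item.
df["any_missing"] = (df["n_missing"] > 0).astype(int)
logit = smf.logit("any_missing ~ C(format)", data=df).fit()
print(logit.summary().tables[1])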
We also investigated the effect of the number of items per screen on the time it takes to complete the questionnaire. We indeed found a significant and monotonically decreasing effect (F = 4.20, p = .006): The duration was longest for the one-item-per-screen format (median4 = 384 seconds), followed by the four-items-per-screen format (median = 316 seconds), the ten-items-per-screen format (median = 303 seconds), and the all-items-per-screen format (median = 302 seconds).
TABLE 3
Effects of Grouping Items on a Screen per Demographic Group
NOTE: M1 = mean score for the 1-item-per-screen format; M4 = mean score for the
4-items-per-screen format; M10 = mean score for the 10-items-per-screen format; M40 =
mean score for the all-items-per-screen format.
Older respondents had longer completion times than their younger counterparts, but there were no significant differences between formats for the two age groups we consider. The design effects for the three education groups were not significant either.

Overall, grouping effects are similar across demographic subgroups, although there are some differences in magnitude and significance level. For example, grouping affects item nonresponse for men, older respondents, and respondents with low education.
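The subgroup comparisons in Table 3 amount to a format-by-group table of mean scores. A minimal pandas sketch of that layout follows, on fabricated data with an illustrative gender split; the column names and scores are assumptions made for the example.

import numpy as np
import pandas as pd

# Fabricated respondent-level data; the scores carry no substantive meaning.
rng = np.random.default_rng(2)
df = pd.DataFrame({
    "format": rng.choice([1, 2, 3, 4], size=2565),
    "gender": rng.choice(["male", "female"], size=2565),
    "score": rng.normal(3.0, 0.5, size=2565),
})

# Mean arousal score per format within each demographic group, as in Table 3.
table3 = df.pivot_table(index="gender", columns="format", values="score", aggfunc="mean")
# Relabel columns M1, M4, M10, M40 by the number of items per screen.
items_per_screen = {1: 1, 2: 4, 3: 10, 4: 40}
table3.columns = [f"M{items_per_screen[c]}" for c in table3.columns]
print(table3.round(2))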
All in all, the optimal number of items per screen requires a trade-off: More items per screen shorten the survey but increase item nonresponse (reducing data quality) and lower respondent satisfaction (with potential consequences for motivation and cooperation in future surveys). Because the negative effects of more items per screen mainly arise when scrolling is required, we are inclined to recommend placing four to ten items on a single screen, avoiding the necessity to scroll.
APPENDIX
This appendix presents screen dumps of two of the formats used in the experiment.
FIGURE 1
Format 1: One Item per Screen
FIGURE 2
Format 4: All Items per Screen (with the scroll bar on the right-hand side)
NOTES
1. As a result of combining the header formats, the one-item-per-screen format has half the
number of respondents compared to the other formats.
2. Because respondents were randomly assigned to a question format after opening the questionnaire, unit nonresponse is not taken into account. As only sixteen of the 2,565 respondents who started did not finish the questionnaire, partial nonresponse is not analyzed either.
3. Respondents could proceed in the questionnaire without answering each question; there
were deliberately no warnings or other checks to reduce item nonresponse.
4. We present median scores because the distribution of response times is skewed.
5. For the ten-items-per-screen version with a single header, the need to scroll depended
on the screen resolution.
REFERENCES
Borgers, N., J. Hox, and D. Sikkel. 2004. Response effects in surveys on children and adolescents: The effect of number of response options, negative wording, and neutral mid-point. Quality & Quantity 38 (1): 17–33.
Bowker, D., and D. A. Dillman. 2000. An experimental evaluation of left and right oriented screens for Web questionnaires. Paper presented at the 55th annual conference of the American Association for Public Opinion Research, Portland, OR, May 18–21. http://survey.sesrc.wsu.edu/dillman/papers/AAPORpaper00.pdf (accessed November 12, 2008).
Bradlow, E. T., and G. J. Fitzsimons. 2001. Subscale distance and item clustering effects in
self-administered surveys: A new metric. Journal of Marketing Research 38 (2): 254–62.
Christian, L. M. 2003. The influence of visual layout on scalar questions in Web surveys. Unpublished master's thesis, Washington State University. http://survey.sesrc.wsu.edu/dillman/papers.htm (accessed November 12, 2008).
Couper, M. P., M. W. Traugott, and M. J. Lamias. 2001. Web survey design and administration.
Public Opinion Quarterly 65 (2): 230–53.
Deutskens, E. C. 2006. From paper-and-pencil to screen-and-keyboard: Studies on the effectiveness of Internet-based marketing research. PhD diss., University of Maastricht, the Netherlands.
Deutskens, E. C., K. De Ruyter, M. Wetzels, and P. Oosterveld. 2004. Response rate and response quality of Internet-based surveys: An experimental study. Marketing Letters 15 (1): 21–36.
Dillman, D. A. 2007. Mail and Internet surveys. The tailored design method. Hoboken, NJ: Wiley.
Dillman, D. A., S. Caldwell, and M. Gansemer. 2000. Visual design effects on item nonresponse to a question about work satisfaction that precedes the Q-12 agree-disagree items. Paper supported by the Gallup Organization and Washington State University, Pullman. http://survey.sesrc.wsu.edu/dillman/papers.htm (accessed November 12, 2008).
Knauper, B., N. Schwarz, and D. Park. 2004. Frequency reports across age groups. Journal of
Official Statistics 20 (1): 91–96.
Krosnick, J. A., and D. F. Alwin. 1987. An evaluation of a cognitive theory of response-order
effects in survey measurement. Public Opinion Quarterly 51 (2): 201–19.
Lozar Manfreda, K., Z. Batagelj, and V. Vehovar. 2002. Design of Web survey questionnaires: Three basic experiments. Journal of Computer-Mediated Communication 7 (3). http://www.ascusc.org/jcmc/vol7/issue3/vehovar.html (accessed November 12, 2008).
Mehrabian, A., and J. Russell. 1974. An approach to environmental psychology. Cambridge,
MA: MIT Press.
Norman, K. L., Z. Friedman, K. Norman, and R. Stevenson. 2001. Navigational issues in the design of on-line self-administered questionnaires. Behaviour & Information Technology 20 (1): 37–45.
Peytchev, A., M. P. Couper, S. Esteban McCabe, and S. D. Crawford. 2006. Web survey
design. Paging versus scrolling. Public Opinion Quarterly 70 (4): 596–607.
Sanchez, M. E. 1992. Effects of questionnaire design on the quality of survey data. Public Opinion Quarterly 52 (2): 732–42.
Schonlau, M., R. D. Fricker, and M. N. Elliott. 2002. Conducting research surveys via e-mail
and the Web. Santa Monica, CA: RAND.
Schwarz, N., and S. Sudman. 1996. Answering questions. San Francisco: Jossey-Bass.
Smyth, J. D., D. A. Dillman, L. M. Christian, and M. J. Stern. 2006. Comparing check-all and
forced-choice question formats in Web surveys. Public Opinion Quarterly 70 (1): 66–77.
Stern, M. J., D. A. Dillman, and J. D. Smyth. 2007. Visual design, order effects, and respondent characteristics in a self-administered survey. Survey Research Methods 1 (3): 121–38.
Sudman, S., N. M. Bradburn, and N. Schwarz. 1996. Thinking about answers. San Francisco:
Jossey-Bass.
Tourangeau, R., M. P. Couper, and F. Conrad. 2004. Spacing, position, and order. Interpretive
heuristics for visual features of survey questions. Public Opinion Quarterly 68 (3):
368–93.
Tourangeau, R., M. P. Couper, and F. Conrad. 2007. Color, labels, and interpretive heuristics
for response scales. Public Opinion Quarterly 71 (1): 91–112.