discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/305462834
Scale Development
All content following this page was uploaded by Louis Tay on 20 July 2016.
Approaches to scale creation. There are two distinct approaches to scale creation. A
deductive approach uses theory and an already-formed conceptualization of the
construct to generate items within its domain. This approach is useful when the definition of the
construct is known and substantial enough to generate an initial pool of items. By contrast, an
inductive approach is useful when there is uncertainty about the definition or dimensionality of the
construct. In this case, organizational incumbents are asked to provide descriptions of the
concept, and the conceptualization derived from these descriptions forms the basis for generating items.
A key step in construct definition is to outline the nomological network, that is, how the
focal construct (and its specific dimensions) is related to other constructs. Once the construct is
defined, one can begin to specify this nomological network, which entails stating what the
construct should be positively related to, negatively related to, and relatively independent of
based on theory. The nomological network will be essential to the validation process, as a scale
that empirically relates to other established measures in the way predicted by theory displays
important types of validity evidence (convergent and divergent validity).
Purposes of created scale. Before discussing the specific principles of item writing, it is
necessary to specify the purpose of the scale. Will the scale be used for research, selection,
development, or another purpose? Is the scale intended for the general population, the population
of adult workers, or another specific population? Outlining the scale's purpose and use in future
contexts will allow one to identify the unique practical concerns related to the scale. This guides
item creation in a number of ways, such as (1) determining an appropriate reading level for the
target population; (2) identifying whether the items should refer to general or specific contexts
and situations (e.g., work contexts); (3) considering differences in how respondents interpret the items
(e.g., the different meanings of the term "stress" in different national contexts); (4) deciding the
type of scale response format and behavioral anchors, which can potentially affect scale
responses; and (5) determining the applicability of reverse scoring, which may not be appropriate
for positive constructs such as virtues.
Principles of item writing. When writing items, one aims to create an initial item pool
that contains many more items than the final scale (e.g., 3-4 times as many). This gives the
researcher more freedom to be selective about the psychometric quality of the items that
survive to the final scale. The redundancy and over-inclusivity of the initial item pool are
also desirable because they can serve to uncover sub-dimensions or closely related but distinct
constructs. As for the actual writing of items, recommendations from a wide range of sources
agree on the following principles: items should be simple and straightforward; one should avoid
slang, jargon, double negatives, ambiguous words, and overly abstract words, and favor
specific and concrete words; items should not be double-barreled (i.e., include two different
ideas in a single question); items should not be leading questions or statements (e.g., "Most
supervisors are toxic. Please respond to how aggressive your supervisor has been to you"); and
items should not be identical
re-statements but should seek to state the same idea in different ways. Finally, it is often helpful
to provide the construct definition, relevant adjectives, and example scale items to item writers
when generating items.
Regarding sampling, the recommended sample size for a preliminary examination of the
psychometric properties of items is 100-200, with a minimum of 300 for a later confirmatory
sample. However, this may depend on group differences and the type of analysis one
seeks to conduct. Based on the scale's theoretical and practical context, one should also seek to
match the validation sample to the intended application of the scale. For instance, if a scale is
meant for use with entrepreneurs, it will be important to obtain a sample from that subpopulation
of interest. Notably, using a broader sample than the target subpopulation can artificially raise
the reliability of the scale. A recommended best practice is to cross-validate the scale across
independent samples to show that scale properties are stable and generalizable.
Scale psychometric properties. After data collection, one needs to establish the reliability
and validity of the scale items. As a first step, it is critical to identify a good set of items with
reasonable psychometric properties. This is usually done by examining the mean, standard
deviation, score range, endorsement proportions across all response options, and item-total
correlation for each item. One should select items that have reasonable item-total correlations
(around .20 or higher), appropriate score ranges (i.e., no ceiling or floor effects), and
responses spread across the different scale options.
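The item screening described above can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed procedure. It assumes a 1-5 response format, uses made-up responses, and computes the corrected item-total correlation (the item is removed from the total before correlating), which is one common variant that the text does not specify. The names `pearson`, `item_analysis`, and `responses` are hypothetical.

```python
from collections import Counter
from statistics import mean, stdev


def pearson(x, y):
    """Pearson correlation; returns NaN if either variable has no variance."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / (var_x * var_y) ** 0.5


def item_analysis(responses, options=(1, 2, 3, 4, 5)):
    """Summarize each item: mean, SD, range, option proportions, item-total r.

    responses: one row per respondent, one column per item.
    options: assumed response scale (here 1-5).
    """
    totals = [sum(row) for row in responses]
    summary = []
    for j in range(len(responses[0])):
        scores = [row[j] for row in responses]
        # Assumption: corrected item-total correlation, i.e. the item is
        # correlated with the sum of the remaining items.
        rest = [t - s for t, s in zip(totals, scores)]
        counts = Counter(scores)
        summary.append({
            "item": j,
            "mean": mean(scores),
            "sd": stdev(scores),
            "min": min(scores),
            "max": max(scores),
            "proportions": {o: counts[o] / len(scores) for o in options},
            "item_total_r": pearson(scores, rest),
        })
    return summary


# Made-up responses from six respondents to four items (illustration only).
responses = [
    [4, 5, 4, 2],
    [2, 1, 2, 4],
    [5, 4, 5, 1],
    [3, 3, 3, 5],
    [1, 2, 1, 3],
    [4, 4, 5, 2],
]

summary = item_analysis(responses)
# Flag items below the rough .20 guideline mentioned in the text.
flagged = [s["item"] for s in summary if s["item_total_r"] < 0.20]
```

In this toy data the last item runs opposite to the others, so it is the one flagged for review. Ceiling or floor effects would show up as a mean near the top or bottom of the scale with most responses concentrated in one option.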
Based on the selected items, there are different approaches to calculating reliability, but
calculating internal consistency is the most common. In general, the rule of thumb for internal
consistency reliability is a minimum of .70, although .90 or higher is recommended for
high-stakes decisions (e.g., selection). One should also calculate the reliability of each
sub-dimension of the construct.
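As an illustration of an internal consistency calculation, the sketch below computes Cronbach's alpha, alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The text does not name a specific coefficient, so the choice of alpha is an assumption, and the data are made up. The function names `cronbach_alpha` and `meets_rule_of_thumb` are hypothetical.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for rows of respondents by columns of items."""
    k = len(responses[0])
    item_vars = [variance([row[j] for row in responses]) for j in range(k)]
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def meets_rule_of_thumb(alpha, high_stakes=False):
    """Compare alpha with the .70 (general) or .90 (high-stakes) guideline."""
    return alpha >= (0.90 if high_stakes else 0.70)


# Made-up responses from six respondents to three items (illustration only).
responses = [
    [4, 5, 4],
    [2, 1, 2],
    [5, 4, 5],
    [3, 3, 3],
    [1, 2, 1],
    [4, 4, 5],
]

alpha = cronbach_alpha(responses)
```

To obtain reliability for sub-dimensions, the same function can be applied to the columns belonging to each sub-dimension separately.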
After establishing reliability and factorial validity, a researcher would continue to gather
validation evidence according to the scale validation design. This may
include examining group differences in scale scores or convergent and divergent validity based
on relations with other related measures. This involves examining how the new construct empirically
relates to other constructs in its nomological network, and this overall process is a test of both the
scale and the underlying theory driving the test.
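One simple way to organize this check is to record, for each measure in the nomological network, whether theory predicts a positive, negative, or near-zero relation, and then compare the observed correlations with those predictions. The sketch below is a rough illustration under stated assumptions. The measure names and scores are made up, and the `near_zero` cutoff of .20 is an arbitrary illustrative value, not a recommendation from the text.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def check_predictions(scale_scores, other_measures, predictions, near_zero=0.20):
    """Compare observed correlations with theoretically predicted directions.

    predictions maps a measure name to "+", "-", or "0" (relatively independent).
    near_zero is an assumed cutoff for treating a correlation as negligible.
    Returns {name: (observed r, whether the prediction was supported)}.
    """
    results = {}
    for name, expected in predictions.items():
        r = pearson(scale_scores, other_measures[name])
        if expected == "+":
            supported = r > near_zero
        elif expected == "-":
            supported = r < -near_zero
        else:
            supported = abs(r) <= near_zero
        results[name] = (r, supported)
    return results


# Made-up scores for six respondents (illustration only).
scale_scores = [1, 2, 3, 4, 5, 6]
other_measures = {
    "measure_a": [2, 3, 3, 5, 6, 6],  # predicted to relate positively
    "measure_b": [6, 5, 5, 3, 2, 1],  # predicted to relate negatively
    "measure_c": [2, 4, 1, 1, 4, 2],  # predicted to be relatively independent
}
predictions = {"measure_a": "+", "measure_b": "-", "measure_c": "0"}

results = check_predictions(scale_scores, other_measures, predictions)
```

A prediction that is not supported may point to a problem with the scale, with the theory, or with both, which is why the process tests the two together.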