The Counseling Psychologist
Downloaded from http://tcp.sagepub.com by SJO TEMP 2008 on December 7, 2008
Methodological Issues in Cross-Cultural Counseling Research
Concerns about the cross-cultural validity of constructs are discussed, including equivalence, bias, and translation procedures. Methods to enhance equivalence are described, as are strategies to evaluate and minimize types of bias. Recommendations for translating instruments are also presented. To illustrate some challenges of cross-cultural counseling research, translation procedures employed in studies published in five counseling journals are evaluated. In 15 of 615 empirical articles, a translation of instruments was performed. In 9 studies, there was some effort to enhance and evaluate equivalence between language versions of the measures employed. In contrast, 2 studies did not report using thorough translation and verification procedures, and 4 studies employed a moderate degree of rigorousness. Suggestions for strengthening translation methodologies and enhancing the rigor of cross-cultural counseling research are provided. To conduct cross-culturally valid research and deliver culturally appropriate services, counseling psychologists must generate and rely on methodologically sound cross-cultural studies. This article provides a schema for performing such studies.
Ægisdóttir et al. / CROSS-CULTURAL VALIDITY 189
have tremendous potential to enhance the basic core of the science and practice of
counseling psychology, both domestically and internationally” (p. 147). He
also predicted, “In the future, counseling psychology will no longer be
defined as counseling psychology within the United States, but rather, the
parameters of counseling psychology will cross many countries and many
cultures” (Heppner, 2006, p. 170).
Although an international focus in counseling is important, there are
many challenges (cf. Douce, 2004; Heppner, 2006; Pedersen, 2003). This
article discusses methodological challenges, especially as related to the
translation and adaptation of instruments for use in international and
cross-cultural studies and their link to equivalence and bias. While there has
been discussion in the counseling psychology literature about the benefits
and challenges of cross-cultural counseling and the risks of simply applying
Western theories and strategies cross-culturally, we were unable to locate
publications in our literature detailing how to perform cross-culturally valid
research. There is literature, however, in other areas of psychology (e.g.,
cross-cultural, social, international) that addresses these topics. This article
draws from this literature to introduce counseling psychologists to some
concepts, methods, and issues when conducting cross-cultural research. We
also extend this literature by discussing the potential use of cross-cultural
methodologies in counseling research.
As a way to illustrate some challenges of cross-cultural research, we also
examine, analyze, and evaluate translation practices employed in five
prominent counseling journals to determine the translation procedures
counseling researchers have used and the methods employed to minimize
bias and evaluate equivalence. Finally, we offer recommendations about
translation methodology and ways to increase validity in cross-cultural
counseling research.
According to Triandis (2000), when using the indigenous approach, researchers are mainly interested in the meaning of concepts in a culture and how such meaning may change across demographics within a cultural context (e.g., what does counseling mean in this culture?). With this approach, psychologists often study their own culture with the goal of benefiting people in that culture. The focus of such studies is the development of a psychology tailored to a specific culture, without a focus on generalization outside of that cultural context (cf. Adamopolous & Lonner, 2001). The main challenge with the indigenous approach is the difficulty in avoiding existing psychological concepts, theories, and methodologies and therefore determining what is indigenous (Adamopolous & Lonner, 2001).
Triandis (2000) contended that with the cultural approach, in contrast, psychologists often study cultures other than their own by using ethnographic methods. True experimental methods can also be used within this approach (van de Vijver, 2001). Again, the meanings of constructs in a culture are the main focus, without direct comparison of constructs across cultures. The aim is to advance the understanding of persons in a sociocultural context and to emphasize the importance of culture in understanding behavior (Adamopolous & Lonner, 2001). The challenge with this approach is a lack of widely accepted research methodology (Adamopolous & Lonner, 2001).
Last, Triandis (2000) stated that when using cross-cultural approaches,
psychologists obtain data in two or more cultures assuming the constructs
under investigation exist in all of the cultures studied. Here, researchers are
interested in how a construct affects behavior differently or similarly across
cultures. Thus, one implication of this approach is an increased
understanding of the cross-cultural validity and generalizability of the
theories and/or constructs. The main challenge with this approach is
demonstrating equivalence of constructs and measures used in the target
cultures and also minimizing biases that may threaten valid cross-cultural
comparisons.
In sum, indigenous and cultural approaches focus on the emics, or things unique to a culture. These approaches are relativistic in that the aim is studying the local context and meaning of constructs without imposing a priori definitions of the constructs (Tanaka-Matsumi, 2001). Scholars representing these approaches usually reject claims that psychological theories are universal (Kim, 2001). In the cross-cultural approach, in contrast, the focus is on the etics, or factors common across cultures (Brislin, Lonner, & Thorndike, 1973). Here the goal is to understand similarities and differences across cultures, and the comparability of cross-cultural categories or dimensions is emphasized (Tanaka-Matsumi, 2001).
Equivalence
Three types of equivalence can be distinguished, arranged in a hierarchy with each higher level presupposing the lower levels. These are construct (or structural), measurement-unit, and scalar equivalence.
At the lowest level is construct equivalence. A scale has construct equivalence if it measures the same underlying construct across cultural groups. Construct equivalence has been demonstrated for many constructs in psychology (e.g., the five-factor model of personality measured by the NEO Personality Inventory-Revised; McCrae & Costa, 1997). With construct equivalence, the constructs (e.g., extraversion) are considered to have the same meaning and nomological network across cultures (the relationships between constructs, hypotheses, and measures; e.g., Betz, 2005) but need not be operationally defined the same way for each cultural group (e.g., van de Vijver, 2001). For instance, two emic measures of attitudes toward counseling may tap different indicators of attitudes in each culture; the measures may therefore include different items yet still be structurally equivalent, as both measure the same dimensions of counseling attitudes and predict help seeking. Because their measurement differs, however, a direct comparison of average test scores across cultures (using a t test or ANOVA, for example) cannot be performed: the measures lack scalar equivalence (see below).
Construct equivalence is often demonstrated using exploratory and confirmatory factor analyses and structural equation modeling (SEM) to discern the similarities and differences of constructs' structure and their nomological networks across cultures.
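One simple numerical check on construct equivalence, complementary to the factor-analytic methods just described, is Tucker's coefficient of factor congruence between the loadings obtained in each cultural group. The sketch below is ours and purely illustrative; the loadings are invented, not taken from any study discussed here:

```python
import math

def tucker_congruence(loadings_a, loadings_b):
    """Tucker's coefficient of factor congruence between the loadings of
    the same items on one factor in two cultural groups."""
    numerator = sum(a * b for a, b in zip(loadings_a, loadings_b))
    denominator = math.sqrt(
        sum(a * a for a in loadings_a) * sum(b * b for b in loadings_b)
    )
    return numerator / denominator

# Hypothetical loadings for five items in two language versions
group_1 = [0.72, 0.65, 0.80, 0.58, 0.69]
group_2 = [0.70, 0.61, 0.77, 0.55, 0.74]

phi = tucker_congruence(group_1, group_2)
# Values of phi near 1 (conventionally at or above .95) are usually read
# as evidence that the factor is structurally similar across groups.
```

A congruence coefficient is only a first screen; the factorial invariance procedures discussed later provide stronger evidence.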
The next level of equivalence is measurement-unit equivalence (van de Vijver, 2001; van de Vijver & Leung, 1997). With this type of equivalence, the measurement scales of the tools are equivalent (e.g., interval level), but their origins differ across groups. Mean scores from scales with this level of equivalence can be compared to examine individual differences within groups (e.g., using a t test), but because of the different origins, comparing mean scores between groups will not yield a valid comparison. For example, the Kelvin and Celsius scales have equivalent measurement units (interval scales) but measure temperature differently: they have different origins, so temperatures on the two scales cannot be compared directly. Because of a constant difference between the two scales, however, comparability is possible (i.e., K = C + 273). The known constant offsetting the scales makes them comparable (van de Vijver & Leung, 1997). Such known constants are difficult to discern in studies of human behavior, often rendering scores at this level incomparable. A clear analogy in counseling psychology is the use of different cut scores for various groups (e.g., gender) on instruments as indicators of some criterion or underlying trait. Different cut scores (or standard scores) are used because the instruments do not show equivalence beyond the measurement unit. That is, some bias affects the origin of the scale for one group relative to the other, limiting raw-score comparability between the groups. For example, a raw score of 28 on the Minnesota Multiphasic Personality Inventory-2 MacAndrew Alcohol Scale-Revised (Butcher, Dahlstrom, Graham, Tellegen, & Kaemmer, 2001) does not mean the same thing for women as it does for men. For women, this score indicates more impulsiveness and greater risk for substance abuse than it does for men (Greene, 2000). A less clear example, but one extremely important to cross-cultural research, involves two language versions of the same psychological instrument. Here the origins of the two language versions of the scale may appear the same (both versions include the same interval rating scale for the items). This assumption, however, may be threatened if the two cultural groups responding to the measure vary in their familiarity with Likert-type answer formats (method bias; see later). Because of this differential familiarity with the stimuli, the origin of the measurement unit is not the same for both groups. Similarly, if the two cultural groups vary in response style (e.g., acquiescence), a score of 2 on a 5-point scale may not mean the same for both groups. In these examples, the origin of the scale differs between the two language versions, compromising valid cross-cultural comparison.
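The temperature analogy can be made concrete. This small sketch (ours, not part of the original discussion) shows how a known constant offset restores comparability between two interval scales with different origins:

```python
def celsius_to_kelvin(c):
    """Kelvin and Celsius share a measurement unit (one degree) but have
    different origins; the known constant 273.15 offsets the two scales."""
    return c + 273.15

# Readings taken on different scales are not directly comparable...
reading_k = 300.0   # Kelvin
reading_c = 25.0    # Celsius
# ...but become comparable once placed on a common origin:
difference = reading_k - celsius_to_kelvin(reading_c)  # about 1.85 degrees
# In studies of human behavior, no such known constant is usually
# available, which is what renders measurement-unit-equivalent scores
# incomparable across groups.
```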
Finally, at the highest level of equivalence, is scalar equivalence, or full score comparability. Instruments equivalent at the scalar level measure a concept with the same interval or ratio scale across cultures, and the origins of the scales are the same. At this level, therefore, bias has been ruled out, and direct cross-cultural comparisons of average scores on an instrument can be made (e.g., van de Vijver & Leung, 1997).
According to van de Vijver (2001), it can be difficult to discern whether measures are equivalent at the measurement-unit or the scalar level. This challenge arises when comparing scale scores between cultural groups responding to the same language version of an instrument as well as between different language versions of a measure. As an example of this difficulty, when the same language version of an instrument is used, racial differences in intelligence test scores can be interpreted either as representing true differences in intelligence (scalar equivalence has been reached) or as an artifact of the measures (only measurement-unit equivalence has been reached). In the latter case, the measurement units are the same, but they have different origins because of various biases, hindering valid comparisons across racial groups; valid comparisons of mean scores cannot be made. Higher levels of equivalence are more difficult to establish. It is, for instance, easier to show that an instrument measures the same construct across cultures (construct equivalence) by demonstrating a similar factor structure and nomological network than it is to demonstrate the instruments' numerical comparability (scalar equivalence). The higher the level of equivalence, though, the more detailed the analyses of cross-cultural similarities and differences that can be performed (van de Vijver, 2001; van de Vijver & Leung, 1997).
Levels of equivalence for measures used in cross-cultural counseling research should be established and reported in counseling psychology publications. It is not until the equivalence of the concepts under study has been determined that a meaningful cross-cultural comparison can be made. Without demonstrated equivalence, numerous rival hypotheses (e.g., poor translation) may account for observed cross-cultural differences.
Bias
This bias has several potential sources. It can result from poor translation, from poor item formulation (e.g., complex wording), or from item content that is not equally relevant or appropriate for the cultural groups being compared (e.g., Malpass & Poortinga, 1986; van de Vijver & Poortinga, 1997). An item on an instrument is considered biased if persons from different cultures who have the same standing on the underlying characteristic (trait or state) being measured yield different average item scores.
Finally, bias can be uniform or nonuniform. A uniform bias is any type of bias affecting all score levels on an instrument equally (van de Vijver & Leung, 1997). For instance, when measuring persons' intelligence, the scale may be accurate for one group but consistently read 10 points too high for another group. The 10-point difference would appear at all intelligence levels (a true score of 90 would be recorded as 100, and a true score of 120 as 130). A nonuniform bias is any type of bias differentially affecting different score levels. In measuring persons' intelligence, the scale may again be accurate for one group, but for the other group, every 10 true-score points are recorded as 12 points. The distortion for persons whose true score is 90 would be 18 points (a recorded score of 108), whereas for persons whose true score is 110, the distortion is 22 points (a recorded score of 132). The distortion is greater at higher levels on the scale. Nonuniform bias is considered a greater threat to cross-cultural comparisons than uniform bias, as it affects both the origin and the measurement unit of a scale, whereas uniform bias affects only the origin (cf. van de Vijver, 1998, 2001).
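The arithmetic of these two bias patterns can be stated directly. This is an illustrative sketch of the example above, not an analytic procedure:

```python
def uniform_bias(true_score, offset=10):
    """Uniform bias: a constant offset distorts every score level equally."""
    return true_score + offset

def nonuniform_bias(true_score, factor=1.2):
    """Nonuniform bias: every 10 true-score points are recorded as 12,
    so the distortion grows with the score level."""
    return true_score * factor

for t in (90, 120):
    print(t, uniform_bias(t))            # the gap is 10 points at both levels
for t in (90, 110):
    print(t, round(nonuniform_bias(t)))  # gaps of 18 and 22 points
```

Because nonuniform bias rescales the measurement unit itself, no single constant can be subtracted to restore comparability, which is why it is the more serious threat.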
Bias and equivalence are closely related. When two or more language versions of an instrument are unbiased (in construct, method, and item), they are considered equivalent at the scalar level. Bias lowers a measure's level of equivalence (construct, measurement-unit, scalar). Also, construct bias has more serious consequences and is more difficult to remedy than method and item bias. For instance, when selecting a preexisting instrument for translation and use with a different language group, the researcher runs the risk of incomplete coverage of the construct in the target culture (i.e., construct bias limiting construct equivalence). Method bias can be minimized, for example, by standardized administration (administering under similar conditions using the same instructions) and by using covariates, whereas thorough translation procedures may limit item bias. Furthermore, higher levels of equivalence are less robust against bias. Scalar equivalence (a necessary condition for comparing average scores between groups) is, for instance, affected by all types of bias and is more susceptible to bias than are the lower levels of equivalence.
MEASUREMENT APPROACHES
Two common strategies are employed when using preexisting measures with multilingual groups. First is the applied approach, in which an instrument undergoes a literal translation of its items. Item content is not changed for the new cultural context, and the linguistic and psychological appropriateness of the items is assumed, as is the absence of any need to change the instrument to avoid bias. According to van de Vijver (2001), this is the most common technique in cross-cultural research on multilingual groups. The second strategy is adaptation, in which some items may be literally translated while others require modification of wording and content to enhance their appropriateness to the new cultural context (van de Vijver & Leung, 1997). This technique is chosen if there is concern about construct bias.
Of the three approaches just mentioned (assembly, application, and adaptation), the application strategy is the easiest and least cumbersome in terms of money, time, and effort. This technique may also offer high levels of equivalence (measurement-unit and scalar equivalence), and it can make comparison with the results of other studies using the same instrument possible. This approach may not be useful, however, when the characteristic behaviors or attitudes (e.g., obedience and being a good daughter or son) associated with the construct (e.g., filial piety) differ across cultures (lack of construct equivalence and high construct bias) (e.g., Ho, 1996). In such instances, the assembly or adaptation strategy may be needed. With the assembly approach (emic), researchers may focus on the construct validity of the instrument (e.g., factor analysis, divergent and convergent validity) rather than on direct cross-cultural comparisons. When adaptation of an instrument is needed, with some items literally translated and others changed or added, cross-cultural comparisons may be challenging, as direct comparisons of total scores may not be feasible because not all items are identical. Only scores on identical items can be compared using mean-score comparisons (Hambleton, 2001). The application (etic) approach to translation most easily allows for a direct comparison of test scores using t tests or ANOVA because of potential scalar equivalence. For such comparisons to be valid, however, an absence of bias needs to be demonstrated.
The applied approach and, to some degree, the adaptation strategy focus on capturing the etics, or the qualities of concepts common across cultures. Yet cultural researchers have criticized this practice. Berry (1989), for instance, labeled it "imposed etics," claiming that by using the etic approach, researchers fail to capture the culturally specific aspects of a construct and may erroneously assume the construct exists and functions similarly across cultures (cf. Adamopolous & Lonner, 2001). The advantage of the etic over the emic strategy, however, is that the etic technique provides the ability to make cross-cultural comparisons, whereas in the emic approach, cross-cultural comparison is more difficult and less direct.
Examining reliability, item-total scale correlations, and item means and variances provides initial information about instruments' psychometric properties. A statistical comparison between two independent reliability coefficients can be performed (cf. van de Vijver & Leung, 1997). If the coefficients are significantly different from each other, the source of the difference should be examined, as it may indicate item or construct bias. Additionally, item-total scale correlations may indicate construct bias and nonequivalence, as well as method bias (e.g., administration differences, differential social desirability, differential familiarity with instrumentation). Finally, item score distributions may point to biased items and, therefore, provide information about equivalence. For instance, an indicator (e.g., item or scale) showing variation in one cultural group but not the other may represent an emic concept (Johnson, 1998). Comparing these statistics across different language versions of an instrument will thus offer preliminary data about the instruments' equivalence (construct, measurement-unit, and scalar; van de Vijver & Leung, 1997; conceptual and measurement; Lonner, 1985).
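One established way to compare two independent Cronbach's alpha coefficients is Feldt's test, in which the ratio of the coefficients' complements is referred to an F distribution. A minimal sketch, with hypothetical alpha values:

```python
def feldt_w(alpha_smaller, alpha_larger):
    """Feldt's statistic for comparing two independent Cronbach's alphas:
    W = (1 - alpha_smaller) / (1 - alpha_larger), referred to an F
    distribution with degrees of freedom based on the two sample sizes."""
    return (1 - alpha_smaller) / (1 - alpha_larger)

# Hypothetical: alpha = .90 for the original-language version,
# alpha = .80 for the translated version
w = feldt_w(0.80, 0.90)   # W of about 2.0
# A W exceeding the critical F value for the two samples signals a
# significant reliability difference whose source (e.g., item or
# construct bias) should then be examined.
```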
Construct (van de Vijver & Leung, 1997), conceptual, and measurement equivalence (Lonner, 1985) can also be assessed at the scale level. Here, exploratory and confirmatory factor analysis, multidimensional scaling techniques, and cluster analysis can be used (e.g., van de Vijver & Leung, 1997). These techniques provide information about whether the construct is structurally similar across cultures and whether the same meaning is attached to the construct. For instance, in confirmatory factor analysis, hypotheses about the factor structure of a measure, such as the number of factors, loadings of variables on factors, and correlations among factors, can be tested. Numerous fit indices can be used to evaluate the fit of the model to the data.
Scalar, or full score, equivalence is more difficult to establish than construct and measurement-unit equivalence, and various biases may threaten this level of equivalence. Item bias, for instance, influences scalar equivalence. Item bias can be ascertained by studying the distribution of item scores for all cultural groups (cf. van de Vijver & Leung, 1997). Item response theory (IRT), in which differential item functioning (DIF) is examined, may be used for this purpose. In IRT, item responses are assumed to be related to an underlying, or latent, trait through a logistic curve known as the item characteristic curve (ICC). The ICCs for each selected parameter (e.g., item difficulty or popularity) are compared for every item in each cultural group using chi-square statistics. Items differing between cultural groups are eliminated before cross-cultural comparisons are made (e.g., Hambleton & Swaminathan, 1985; van de Vijver & Leung, 1997). Item bias can also be examined by using ANOVA. The item score is treated as the dependent variable, and the cultural group (e.g., two levels) and score levels (levels dependent on the number of scale items and the number of participants scoring at each
level) are the independent variables. Main effects for culture and the interaction between culture and score level are then examined; significant effects indicate biased items (cf. van de Vijver & Leung, 1997). Logistic regression can also be used for this purpose, with the same types of independent and dependent variables. Additionally, multiple-group confirmatory factor analysis (MCFA) invariance tests within SEM and multiple-group mean and covariance structures (MACS) analyses provide information about biased items or indicators (e.g., Byrne, 2004; Cheung & Rensvold, 2000; Little, 1997, 2000), with the MACS method also providing information about mean differences between groups on latent constructs (e.g., Ployhart & Oswald, 2004).
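The layout inspected in the ANOVA approach can be sketched in a few lines. The code below (ours, with invented data) only builds the culture-by-score-level cell means such an analysis examines: a roughly constant between-group gap across score levels points to uniform item bias, while a gap that changes across levels points to nonuniform bias.

```python
from collections import defaultdict

def item_means_by_level(records):
    """records: (culture, score_level, item_score) triples, where
    score_level stratifies respondents by total test score.
    Returns {score_level: {culture: mean item score}}."""
    cells = defaultdict(lambda: defaultdict(list))
    for culture, level, item_score in records:
        cells[level][culture].append(item_score)
    return {level: {c: sum(v) / len(v) for c, v in groups.items()}
            for level, groups in cells.items()}

# Invented binary item responses for two cultures at two score levels
data = [("A", 1, 0), ("A", 1, 1), ("B", 1, 0), ("B", 1, 0),
        ("A", 2, 1), ("A", 2, 1), ("B", 2, 1), ("B", 2, 0)]
means = item_means_by_level(data)
# Culture A outscores B by .5 at both levels: a pattern a two-way ANOVA
# would pick up as a culture main effect (uniform bias) with no
# culture-by-level interaction.
```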
Finally, factors contributing to method bias can be assessed and statistically held constant when measuring constructs across cultures, given that valid measures are available. A measure of social desirability may, for instance, be used to partially control for method bias. Also, gross national product per capita may be used to control for method bias, as it has been found to correlate with social desirability (e.g., Van Hemert, van de Vijver, Poortinga, & Georgas, 2002) and acquiescence (Johnson et al., 2005). Furthermore, personal experience variables potentially influencing the construct under study differentially across cultures may serve as covariates.
Translation Methodology
Prior to the development of the ITC standards, Brislin et al. (1973) and Brislin (1986) had written extensively about translation procedures. The following paragraphs outline the common translation methods that Brislin et al. summarized, with connections to the ITC guidelines (e.g., Hambleton & de Jong, 2003; van de Vijver & Hambleton, 1996). Additional methods to enhance the equivalence of translated scales are also mentioned.
At other times, the original language version of the scale is also changed to
ensure equivalence, a process known as decentering (Brislin et al., 1973).
Adequate back translation does not guarantee a good translation of a scale,
as this procedure often leads to literal translation at the cost of readability
and naturalness of the translated version. To minimize this, a team of back
translators with a combined expertise in psychology and linguistics may be
used (van de Vijver & Hambleton, 1996). It is also important to note that in
addition to the test items, test instructions need to go through a thorough
translation/back-translation process.
METHOD
Sample
Procedure
RESULTS
Table 1 lists results for each of the 15 studies. Three of the included studies used a structured or semistructured interview-test protocol. In 3 studies, of which one included a semistructured test protocol, an English-language instrument was developed and then translated into another language. Furthermore, in 9 studies, one or more preexisting measures (the entire instrument or a subset of items) were translated into a language other than English. Across the 15 studies, a range of constructs was examined, including persons' counseling orientations (e.g., help-seeking attitudes, counseling expectations), adjustment (e.g., acculturation), and maladjustment (e.g., psychological stress). A diversity of cultural groups was represented in the 15 studies as well (see Table 1).
Two main criteria were used to evaluate these 15 studies: (a) the translation methodology employed (single person, committee, back translation, pretest), which provides judgmental evidence about the equivalence of the translated measure to the original measure; and (b) whether statistical methods were used to verify the equivalence of the translated measure to its original-language version. Because the studies varied in purpose and in the approaches taken when investigating multicultural groups, and because these strategies were linked with different opportunities for measuring equivalence and bias, we divided the 15 studies into three categories: target-language, cross-cultural, and equivalence studies. The target-language studies included projects in which only translated versions of measures were investigated. These studies employed either cross-cultural (etic) methodologies or a combination of cultural and cross-cultural methodologies (emic-etic). For these studies, no direct comparison was made between an original and a translated version of the protocol. The second category of studies used a cross-cultural approach, comparing two or more groups on a certain construct; each group received the original or the translated version of a measure. Finally, the third category of studies was specifically designed to examine equivalence between two language versions of an instrument. These studies we termed "equivalence studies."
We identified studies that employed sound versus weak translation methodologies. This task turned out to be difficult, however, because of the scarcity of information reported about the translation processes used. Sometimes the translation procedure was described in only a couple of sentences; in other instances, the translation methodology was discussed in more detail.
TABLE 1: Studies Involving Translation of Instruments

(For each study: assigned number, citation, and journal; construct; sample; instrument name; type of translation; approach to translation; back translation; pretest; and psychometrics reported for the original and target versions.)

1. Shin, Berkson, & Crittenden (2000); JMCD. Construct: psychological help-seeking attitudes; traditional values. Sample: immigrants from Korea. Instruments: six items from the Attitudes Toward Seeking Professional Psychological Help Scale (ATSPPH); Acculturation Attitude Scale (AAS), prior translation; vignettes developed in English. Translation: English to Korean. Approach: committee. Back translation: yes. Pretest: no. Psychometrics (original): N/A. Psychometrics (target): ATSPPH: factor analysis; AAS: Cronbach's alpha (N = 110 Korean immigrants in the U.S.).

2. Engels, Finkenauer, Meeus, & Dekovic (2001); JCP. Construct: parental attachment; relational competence; self-esteem; depression. Sample: Dutch adolescents. Instruments: Parent and Peer Attachment (IPPA); Perceived Competence Scale for Children; Self-Esteem Scale; Depressive Mood List. Translation: English to Dutch; unclear what instruments were translated in the study. Approach: committee (researchers). Back translation: yes (researchers). Pretest: no. Psychometrics (original): N/A. Psychometrics (target): Cronbach's alpha (N = 412 Dutch adolescents).

3. Chung & Bemak (2002); JCD. Construct: anxiety; depression; psychosocial dysfunction symptoms. Sample: Southeast Asian refugees. Instrument: Health Opinion Survey (interview). Translation: English to Vietnamese, Khmer, Laotian. Approach: committee. Back translation: yes. Pretest: pilot interviews. Psychometrics (original): N/A. Psychometrics (target): exploratory factor analysis for Vietnamese (N = 867), Cambodian (N = 590), and Laotian (n = 723) persons.

4. Kasturirangan & Nutt-Williams (2003); JMCD. Construct: culture; domestic violence. Sample: Latino women. Instrument: a semistructured interview protocol developed by the researchers, administered in English (n = 3 Latina women) and in Spanish (n = 7 Latina women). Translation: English to Spanish; no discussion of translation method. Approach: not reported. Back translation: no. Pretest: pilot interview of the English version of the protocol prior to data collection; no comparison between the English and Spanish versions of the protocol. Psychometrics (original): N/A. Psychometrics (target): a Latina professor of foreign language served as an auditor to ensure proper translation of transcripts from Spanish to English.

5. Asner-Self & Schreiber (2004); MECD. Construct: attributional style. Sample: immigrants from Central America. Instrument: Attributional Style Questionnaire (ASQ). Translation: English to Spanish. Approach: committee. Back translation: yes. Pretest: no. Psychometrics (original): N/A. Psychometrics (target): Cronbach's alpha, principal components analysis (N = 89 Central American immigrants in the U.S.).

6. Torres & Rollock (2004); MECD. Construct: acculturation-related challenges. Sample: immigrants from Central and South America. Instrument: Cultural Adjustment Difficulties Checklist (CADC). Translation: English to Spanish. Approach: committee. Back translation: yes. Pretest: no. Psychometrics (original): not reported for the 10% of the sample that responded to this version. Psychometrics (target): Cronbach's alpha (N = 86 Hispanic immigrants); 90% of the sample responded to the translated version of the instruments. No comparison reported between the two language versions.

7. Oh & Neville (2004); TCP. Construct: Korean rape myth acceptance. Sample: Korean college students. Instrument: Illinois Rape Myth Acceptance Scale (IRMAS); 26 items from the IRMAS were translated and included in the preliminary version of the Korean Rape Myth Acceptance Scale (KRMAS). Translation: English to Korean. Approach: single person. Back translation: yes. Pretest: yes; a focus group (n = 4 South Korean nationals) evaluated each item from the IRMAS and 26 items generated from the Korean literature; all items were in Korean. Psychometrics (original): N/A. Psychometrics (target): Study 1: principal components analysis followed by exploratory factor analysis (N = 348 South Korean college students). Study 2: confirmatory factor analysis, factorial invariance procedure, Cronbach's alpha, and MANOVA to establish criterion validity (N = 547 South Korean nationals). Study 3: test-retest reliability (N = 40 South Korean teachers or school administrators).

8. Asner-Self & Marotta (2005); JCD. Construct: depression, anxiety, phobic anxiety; Erikson's eight psychosocial stages. Sample: immigrants from Central America. Instruments: Brief Symptom Inventory (BSI); Measures of Psychosocial Development (MPD). Translation: English to Spanish. Approach: not reported. Back translation: yes. Pretest: not reported. Psychometrics (original): not reported. Psychometrics (target): not reported; no information about the number of participants responding to the English or Spanish versions of the instruments. Volunteers were probed about the research experience.

9. Wei & Heppner (2005); TCP. Construct: clients' perceptions of counselor credibility; working alliance. Sample: counselor-client dyads in Taiwan. Instruments: Counselor Rating Form-Short Version (CRF-S); Working Alliance Inventory-Short Version (WAI-S). Translation: English to Mandarin. Approach: single person. Back translation: yes. Pretest: no. Psychometrics (original): N/A. Psychometrics (target): Cronbach's alpha; intercorrelations among CRF subscales (construct validity) (N = 31 counselor-client dyads in Taiwan).

Cross-cultural studies

10. Marino, Stuart, & Minas (2000); MECD. Construct: acculturation. Sample: Anglo-Celtic Australians and Vietnamese immigrants to Australia. Instrument: a questionnaire developed in English measuring behavioral and psychological acculturation and socioeconomic and demographic influences on acculturation. Translation: English to Vietnamese. Approach: committee. Back translation: yes. Pretest: yes (n = 10), Vietnamese version. Psychometrics (original): Cronbach's alpha (N = 196 Anglo-Celtic Australians). Psychometrics (target): Cronbach's alpha (N = 187 Vietnamese Australians); Vietnamese participants responded to either an English or a Vietnamese version of the instrument. Statistical evidence of equivalence between the two language versions was not reported.

11. Ægisdóttir & Gerstein (2000); JCD. Construct: counseling expectations; Holland's typology. Sample: Icelandic and U.S. college students. Instrument: Expectations About Counseling Questionnaire (EAC-B). Translation: English to Icelandic. Approach: committee. Back translation: yes. Pretest: focus group (n = 8), Icelandic version. Psychometrics (original): Cronbach's alpha (N = 225 U.S. college students). Psychometrics (target): Cronbach's alpha (N = 261 Icelandic college students). Covariate analysis (prior
Self-Directed Search
students)
counseling experience) used
(SDS)
to control for method bias.
12. Poasa, Causal attributions U.S., American Questionnaire of English to Samoan Single person Yes
English version of A team of English- A team of Samoan-speaking
Mallinckrodt, & Somoan, & Attribution and
QAC pilot speaking
persons (n = 3)
Suzuki (2000); Western Samoan Culture (QAC;
tested and persons (n = 4)
independently coded the
TCP college students vignettes with open-
respondents independently
Samoan language responses
ended response probes
provided
coded the
from QAC and interviews
developed in English)
feedback to
English-
(N = 50). No information
valuate
language
about if themes/codes were
quivalence
responses from
translated from Samoan to
n = 16)
QAC and inter-
English
iews (N = 23)
13. Tang (2002); Career choice Chinese, A questionnaire English to Chinese Single person Yes
No None reported for None reported for Chinese
JMCD Chinese-American, developed in English (researcher)
Caucasian (N = 120) college students
& Caucasian in the study to
American (N =
American college measure influences on
124) and Asian
students career choice
American
131) college
tudents
Equivalence studies
14. Chang & Wellness Immigrants from The Wellness Evaluation English to Korean Single translator No
Yes (n = 3): None reported for None reported for a larger
Myers (2003); Korea of Lifestyle (WEL) whose translations
Bilingual exam- a larger sample sample (N not reported)
MECD were edited by first
inees took both
(N not
author.
the English and
reported)
Discrepancies
the Korean
resolved between
version. Effect
translator and
size (Cohen's d)
editor upon mutual
of difference in
agreement
mean scores
etween
nglish and
orean version
15. Mallinckrodt & Adult attachment Int'l students from The Experiences in English to Chinese Committee Yes
No Split-half Used bilinguals (n = 30
Wang (2004); JCP Taiwan Close Relationships
reliability, Taiwanese international
Scale (ECRS)
Cronbach's
college students) to evaluate
lpha (N = 399
equivalence using DLSH
.S. college
method: within-subjects
tudents)
t test between two language
versions, split-half reliability, Cronbach's alpha, test-retest reliability and construct validity correlations with a related construct
210 THE COUNSELING PSYCHOLOGIST / March 2008
(e.g., number and qualifications of translators and back translators), and in fewer instances still, examples were provided of acceptable and unacceptable item translations.
Despite these difficulties, and based on the available information, we contrasted relatively sound and weak translation procedures. Translation methods we considered weak did not incorporate any mechanism to evaluate the translation, whether judgmental (e.g., back translation, use of bilinguals, pretest) or quantitative (statistical evidence of equivalence). Instead, a protocol was translated into one or more languages without any apparent evaluation of its equivalence to the original-language version. Methodologically sound studies incorporated both judgmental and quantitative methods to assess the validity of the translation. Given these criteria for evaluating the methodological rigor of the translation process, we now present our analyses of the 15 identified studies.
committee approach to translation, they relied on back translation, and they employed
one or more independent experts to evaluate the equivalence of the language
forms. They also reported making subsequent changes to the translated
version of the instruments they were using. Additionally, in some of these
studies, a pretest of the translated protocol was performed, and in all of these
projects, the investigators discussed the statistical tests of the measures’
psychometric properties (see Table 1).
The remaining three studies in this category (1, 2, and 6) contained translation methods of moderate quality, in that their rigor fell between what we considered relatively weak and relatively strong translation procedures. In fact, the translation process was not fully described. Furthermore, in one instance, the same person performed the translation and the back translation (2), and in another (6), no assessment of equivalence between the two language versions of the scale was reported before responses were collapsed into one data set. Also, in one study (1), translated items from an existing scale were selected a priori without any quantitative or qualitative (e.g., pretest) assurance that these items fit the cultural group to which they were administered. In none of these three studies were the measures pretested before data were collected for the main study. Finally, insufficient information was reported about the translated instruments' psychometric properties to evaluate the validity of the measures for the targeted cultural groups. The internal validity of these studies could have been greatly improved had the researchers included some of these procedures in the translation and verification process.
approach. It is noteworthy that all four studies in this category failed to assess the factor structure of the different language versions of the measures and, as such, did not provide additional support for construct equivalence. Similarly, none of these studies assessed item bias or performed any detailed analyses to verify scalar equivalence. Employing these additional analyses would have greatly enhanced the validity of the reported cross-cultural comparisons in these four studies.
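As an illustration of the kind of item-bias screen called for here, the following is a minimal numpy sketch. It is not taken from any of the reviewed studies: the regression-based screen, function name, and thresholds are our own illustrative assumptions, a rough analogue of regression-based DIF for continuous item scores.

```python
import numpy as np

def screen_item_bias(items_a, items_b):
    """Rough item-bias (DIF) screen for two language groups.

    items_a, items_b: (n_persons, n_items) score matrices for the
    original-language and target-language groups. Each item score is
    regressed on the rest-score (total minus that item) plus a group
    indicator; a large group coefficient means the groups score
    differently on the item even at the same trait level.
    """
    scores = np.vstack([items_a, items_b]).astype(float)
    group = np.concatenate([np.zeros(len(items_a)), np.ones(len(items_b))])
    total = scores.sum(axis=1)
    coefs = []
    for j in range(scores.shape[1]):
        rest = total - scores[:, j]  # conditioning variable (trait proxy)
        X = np.column_stack([np.ones_like(rest), rest, group])
        beta, *_ = np.linalg.lstsq(X, scores[:, j], rcond=None)
        coefs.append(beta[2])  # group effect at equal rest-score
    return np.array(coefs)
```

Items with large absolute coefficients would be candidates for retranslation or removal before any cross-cultural comparison is made.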
Interpretation of Results
The current results are consistent with Mallinckrodt and Wang (2004), who discovered in their review of articles published in two counseling journals (JCP and TCP) that few studies in counseling psychology have investigated multilingual or international groups or employed translation methods. Additionally, consistent with these investigators, we found that, in many instances, counseling researchers used inadequate procedures to verify equivalence between language versions of an instrument. For example, our analyses indicated that just over half of the 15 studies employed a committee of translators, an approach highly recommended in the ITC guidelines (van de Vijver & Hambleton, 1996).
We also discovered that in fewer than half of the 15 studies the measurement devices were pretested, and that in slightly more than half the researchers used quantitative methods to further demonstrate equivalence. Furthermore, only 1 study systematically controlled for method bias, while none of the 15 studies assessed item bias. All of these procedures are recommended in the ITC guidelines. On a positive note, however, all but 2 studies used a back-translation procedure to enhance equivalence. Taken together, these results are disquieting and lead us to call for more rigorous research designs when studying culture, when using and evaluating translated instruments, and when performing cross-cultural comparisons.
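The quantitative checks at issue here, such as comparing the internal consistency of each language version and running a within-subjects test on bilinguals who complete both versions, can be sketched with generic formulas. This is a minimal numpy illustration under our own assumptions, not code from any reviewed study:

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for an (n_persons, n_items) score matrix."""
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1).sum()
    total_var = scores.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1.0 - item_vars / total_var)

def bilingual_t(orig_totals, trans_totals):
    """Within-subjects t statistic comparing the total scores the same
    bilingual examinees obtained on the two language versions; values
    near zero are consistent with equivalence of the versions."""
    d = np.asarray(orig_totals, float) - np.asarray(trans_totals, float)
    return d.mean() / (d.std(ddof=1) / np.sqrt(len(d)))
```

Alpha would typically be reported separately for each language version, and the paired t statistic only addresses mean-level (scalar) differences, so it complements rather than replaces factor-analytic checks.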
Additionally, we found that, in many cases, limited attention was paid to discussing translation methods. Hambleton (2001) also observed this trend. Not knowing the reason for this lack of effort, we can only speculate about why methods of translation were not described in more detail. One reason could be the lack of importance placed on this methodological feature of a research design. Another may relate to an author's desire to comply with journal page limitations. A third could be a researcher's failure to recognize the importance of reporting the details of translation methods. Finally, it is conceivable that researchers assume others are aware of common translation methods and thus do not discuss the methods they use in much detail. Whatever the reasons, consistent with the ITC guidelines, we strongly suggest that investigators provide detailed information about the methods they employ when translating and validating instruments used in research. This is especially important because an inappropriate translation of a measure can pose a serious threat to a study's internal validity, may contribute to bias, and, in international comparisons, may limit the level of equivalence between multilingual versions of a measure. As a threat to internal validity, a poorly translated instrument may act as a strong rival hypothesis for obtained results.
RECOMMENDATIONS
Translation Practices
Several steps are essential for a valid translation. Based on our review, Brislin and colleagues' (Brislin, 1986; Brislin et al., 1973) review of common translation methods, and the ITC guidelines (e.g., Hambleton, 2001; van de Vijver & Hambleton, 1996), the best translation procedure involves the steps outlined in Table 2:

1. Independent translation by two or more persons familiar with the target language and culture and with the intent of the scale
2. Documentation of comparisons of translations and agreement on the best translation
3. Rewriting of translated items to fit the grammatical structure of the target language
4. Independent back translation of the translated measure into the original language (one or more persons)
5. Comparison of the original and back-translated versions, focusing on appropriateness, clarity, and meaning (e.g., using rating scales)
6. Changes to the translated measure based on the prior comparison; changed items go through the translation/back-translation iteration until satisfactory
7. If concepts or ideas do not translate well, deciding which version of the scale should be used for cross-cultural comparison (original, back translated, or decentered)
8. Pretest of the translated instrument on an independent sample (bilinguals or target-language group), checking for clarity, appropriateness, and meaning
9. Assessment of the scale's reliability and validity, absence of bias, and equivalence to the original-language version of the scale

All but the last step outlined in Table 2 help to minimize item and construct bias and therefore may increase scalar equivalence between language versions of a measure (ITC development guidelines). The last step refers to verifying the cross-cultural validity of measures (i.e., absence of bias and equivalence; ITC interpretation guidelines).
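Steps 5 and 6 above, rating original/back-translated item pairs and iterating on low-rated items, amount to simple bookkeeping that can be sketched as follows; the item ids, ratings, and the 4.0 cutoff are hypothetical placeholders:

```python
def items_needing_retranslation(ratings, cutoff=4.0):
    """ratings: dict mapping item id -> list of reviewer equivalence
    ratings (e.g., 1-5 for each original/back-translated pair).
    Returns item ids whose mean rating falls below the cutoff and
    should go through another translation/back-translation round."""
    return sorted(
        item for item, rs in ratings.items()
        if sum(rs) / len(rs) < cutoff
    )

# Hypothetical reviewer ratings for three item pairs:
ratings = {
    "item_01": [5, 5, 4],   # clear equivalence
    "item_02": [3, 2, 4],   # meaning drifted in back translation
    "item_03": [4, 4, 5],
}
flagged = items_needing_retranslation(ratings)  # ["item_02"]
```

The loop terminates when no items are flagged, at which point the pretest (step 8) can proceed.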
enhances the validity of the findings. Third, when method bias is not expected but there is a potential for construct bias, and the use of a preexisting measure is considered feasible, researchers should consider collecting emic items to include in the instrument when studying an etic construct (e.g., Brislin, 1976; Oh & Neville, 2004). This approach will enhance construct equivalence by limiting construct bias and will provide culture-specific information to aid theory development. Fourth, when emic scales are available in the cultures of interest to assess an etic construct and cross-cultural comparisons are sought, the convergence approach should be considered. With this approach, all instruments are translated and administered to each cultural group. Then, items and scales shared across cultures are used for cross-cultural comparisons, whereas nonshared items provide information about the unique aspects of the construct in each culture (e.g., van de Vijver, 1998). This approach will enhance construct equivalence, may deepen the current understanding of cultural and cross-cultural dimensions of a construct, and may aid theory development. Finally, Triandis's (1972, 1976) suggestion can be considered. With this procedure, instruments are simultaneously assembled in each culture to measure the etic construct (e.g., subjective well-being). With this approach, most or all types of bias can be minimized and equivalence enhanced, as no predetermined stimuli are used. Factor analyses can be performed to identify etic constructs for cross-cultural comparisons.
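The factor-analytic comparison suggested here can be illustrated with a small numpy sketch computing Tucker's coefficient of congruence between first-principal-component loadings in two cultural groups, a common index of factor similarity. The simulation setup and function names are our own illustrative assumptions, not the article's:

```python
import numpy as np

def first_component_loadings(scores):
    """Loadings of items on the first principal component of a group's
    item correlation matrix (scores: n_persons x n_items)."""
    R = np.corrcoef(scores, rowvar=False)
    vals, vecs = np.linalg.eigh(R)      # eigenvalues in ascending order
    v = vecs[:, -1]                     # eigenvector of largest eigenvalue
    return v * np.sqrt(vals[-1])

def tucker_phi(a, b):
    """Tucker's coefficient of congruence between two loading vectors;
    absolute values near 1 indicate similar factor structure across
    groups (the sign of an eigenvector is arbitrary)."""
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))
```

A low congruence coefficient would argue against construct equivalence and hence against pooling or directly comparing scores across the two groups.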
CONCLUSION
solution-focused interventions in line with their cultural norms. Similarly, one should not assume that an instrument developed in one culture is appropriate to use with, and will yield valid findings about, another cultural group.
Counseling psychologists should not only demonstrate cultural awareness, knowledge, and skills to deliver competent mental health services (American Psychological Association, 2003; Arredondo et al., 1996), they should also display this talent in cross-cultural research. Understanding methods of sound translation and procedures for reducing bias and enhancing the validity of cross-cultural findings is essential for the informed scientist-professional. To deliver culturally appropriate and effective services, counseling psychologists must generate and rely on valid cross-cultural studies. Additionally, we should collaborate with professionals worldwide. The science and practice of cross-cultural counseling psychology would be strengthened through this effort. More important, there would be a greater likelihood that the various paradigms of cross-cultural counseling psychology would be appropriate to the culture, context, and population being studied and/or served. Ultimately, such paradigms can contribute to the preservation of different cultures worldwide and enhance individuals' quality of life.
REFERENCES
Adamopoulos, J., & Lonner, W. J. (2001). Culture and psychology at a crossroad: Historical perspective and theoretical analysis. In D. Matsumoto (Ed.), The handbook of culture and psychology (pp. 11-34). New York: Oxford University Press.
American Psychological Association. (2003). Guidelines on multicultural education, training, research,
practice, and organizational change for psychologists. American Psychologist, 58,
377-402.
Arredondo, P., Toporek, R., Brown, S. P., Jones, J., Locke, D. C., Sanchez, J., et al. (1996). Operationalization of the multicultural counseling competencies. Journal of Multicultural Counseling and Development, 24, 42-78.
Asner-Self, K. K., & Marotta, S. A. (2005). Developmental indices among Central American immigrants exposed to war-related trauma: Clinical implications for counselors. Journal of Counseling and Development, 83, 162-171.
Asner-Self, K. K., & Schreiber, J. B. (2004). A factor analytic study of the attributional style
questionnaire with Central American immigrants. Measurement and Evaluation in
Counseling & Development, 37, 144-153.
Berry, J. W. (1969). On cross-cultural comparability. International Journal of Psychology, 4, 119-128.
Berry, J. W. (1989). Imposed etics-emics-derived etics: The operationalization of a compelling idea.
International Journal of Psychology, 26, 721-735.
Betz, N. E. (2005). Enhancing research productivity in counseling psychology: Reactions to three
perspectives. The Counseling Psychologist, 33, 358-366.
Brislin, R. W. (1970). Back-translation for cross-cultural research. Journal of Cross-Cultural
Psychology, 1, 185-216.
Brislin, R. W. (1976). Comparative research methodology: Cross cultural studies. International Journal
of Psychology, 11, 213-229.
Ægisdóttir et al. / CROSS-CULTURAL VALIDITY 217
Brislin, R. W. (1983). Cross cultural research in psychology. Annual Review of Psychology, 34,
363-400.
Brislin, R. W. (1986). The wording and translation of research instruments. In W. J. Lonner &
J. W. Berry (Eds.), Field methods in cross-cultural research (pp. 137-164). Beverly Hills, CA: Sage.
Brislin, R. W., Lonner, W. J., & Thorndike, R. M. (1973). Cross-cultural research methods. New
York: John Wiley.
Butcher, J. N., Dahlstrom, W. G., Graham, J. R., Tellegen, A., & Kaemmer, B. (2001). MMPI-2
(Minnesota Multiphasic Personality Inventory 2): Manual for administration and scoring
(Rev. ed.). Minneapolis: University of Minnesota Press.
Byrne, B. M. (2004). Testing for multigroup invariance using AMOS graphics: A road less traveled. Structural Equation Modeling: A Multidisciplinary Journal, 11, 272-300.
Chang, C. Y., & Myers, J. E. (2003). Cultural adaptation of the wellness evaluation of lifestyle: An assessment challenge. Measurement and Evaluation in Counseling and Development, 35, 239-250.
Cheung, G. W., & Rensvold, R. B. (2000). Assessing extreme and acquiescence response sets in
cross-cultural research using structural equation modeling. Journal of Cross-Cultural
Psychology, 31, 188-213.
Chung, R. C., & Bemak, F. (2002). Revisiting the California Southeast Asian mental health needs
assessment data: An examination of refugee ethnic and gender differences. Journal of
Counseling and Development, 80, 111-119.
Crowne, D. P., & Marlowe, D. (1960). A new scale of social desirability independent of psychopathology.
Journal of Consulting Psychology, 24, 349-354.
Douce, L. A. (2004). Globalization of counseling psychology. The Counseling Psychologist, 32,
142-152.
Engels, R. C. M. E., Finkenauer, C., Meeus, W., & Dekovic, M. (2001). Parental attachment and
adolescents’ emotional adjustment: The associations with social skills and relational
competence. Journal of Counseling Psychology, 48, 428-439.
Fouad, N. A. (1991). Training counselors to counsel international students. The Counseling
Psychologist, 19, 66-71.
Gerstein, L. H. (2005). Counseling psychologists as international social architects. In R. L. Toporek, L. H. Gerstein, N. A. Fouad, G. Roysircar-Sodowsky, & T. Israel (Eds.), Handbook for social justice in counseling psychology: Leadership, vision, and action (pp. 377-387). Thousand Oaks, CA: Sage.
Gerstein, L. H., & Ægisdóttir, S. (Eds.). (2005a). Counseling around the world [Special issue]. Journal of Mental Health Counseling, 27, 95-184.
Gerstein, L. H., & Ægisdóttir, S. (Eds.). (2005b). Counseling outside of the United States: Looking in and reaching out [Special section]. Journal of Mental Health Counseling, 27, 221-281.
Gerstein, L. H., & Ægisdóttir, S. (2005c). A trip around the world: A counseling travelogue! Journal of Mental Health Counseling, 27, 95-103.
Greene, R. L. (2000). The MMPI-2: An interpretive manual (2nd ed.). Boston: Allyn & Bacon.
Hambleton, R. K. (2001). The next generation of the ITC test translation and adaptation guidelines. European Journal of Psychological Assessment, 17, 164-172.
Hambleton, R. K., & de Jong, J. H. A. L. (2003). Advances in translating and adapting educational and psychological tests. Language Testing, 20, 127-134.
Hambleton, R. K., & Swaminathan, H. (1985). Item response theory: Principles and applications. Dordrecht, Netherlands: Kluwer.
Heppner, P. P. (2006). The benefits and challenges of becoming cross-culturally competent counseling psychologists: Presidential address. The Counseling Psychologist, 34, 147-172.
Ho, D. Y. F. (1996). Filial piety and its psychological consequences. In M. H. Bond (Ed.), Handbook of Chinese psychology (pp. 155-165). Hong Kong: Oxford University Press.
Jahoda, G. (1966). Geometric illusions and environment: A study in Ghana. British Journal of
Psychology, 57, 193-199.
Johnson, T., Kulesa, P., Cho, Y. I., & Shavitt, S. (2005). The relation between culture and response styles: Evidence from 19 countries. Journal of Cross-Cultural Psychology, 36, 264-277.
Johnson, T. P. (1998). Approaches to equivalence in cross-cultural and cross-national survey research. In ZUMA (Zentrum für Umfragen, Methoden und Analysen)-Nachrichten Spezial Band 3: Cross-cultural survey equivalence (pp. 1-40). Retrieved from http://www.gesis.org/Publikationen/Zeitschriften/ZUMA_Nachrichten_spezial/zn-sp-3-inhalt.htm
Kasturirangan, A., & Nutt-Williams, E. (2003). Counseling Latina battered women: A qualitative study of the Latina perspective. Journal of Multicultural Counseling and Development, 31, 162-178.
Kim, U. (2001). Culture, science, and indigenous psychologies. In D. Matsumoto (Ed.), The handbook of culture and psychology (pp. 51-76). New York: Oxford University Press.
Leong, F. T. L., & Blustein, D. L. (2000). Toward a global vision of counseling psychology. The Counseling Psychologist, 28, 5-9.
Leong, F. T. L., & Ponterotto, J. G. (2003). A proposal for internationalizing counseling psychology in the
United States: Rationale, recommendations, and challenges. The Counseling
Psychologist, 31, 381-395.
Leung, S. A. (2003). A journey worth traveling: Globalization of counseling psychology. The
Counseling Psychologist, 31, 412-419.
Little, T. D. (1997). Mean and covariance structures (MACS) analyses of cross-cultural data: Practical and theoretical issues. Multivariate Behavioral Research, 32, 53-76.
Little, T. D. (2000). On the comparability of constructs in cross-cultural research: A critique of Cheung and Rensvold. Journal of Cross-Cultural Psychology, 31, 213-219.
Lonner, W. J. (1985). Issues in testing and assessment in cross-cultural counseling. The Counseling Psychologist, 13, 599-614.
Lonner, W. J., & Berry, J. W. (Eds.). (1986). Field methods in cross-cultural research. Beverly Hills, CA: Sage.
Mallinckrodt, B., & Wang, C.-C. (2004). Quantitative methods for verifying semantic equivalence of translated research instruments: A Chinese version of the Experiences in Close Relationships Scale. Journal of Counseling Psychology, 51, 368-379.
Malpass, R. S., & Poortinga, Y. H. (1986). Strategies for design and analysis. In W. J. Lonner & J. W. Berry (Eds.), Cross-cultural research and methodology series: Vol. 8. Field methods in cross-cultural research (pp. 47-83). Beverly Hills, CA: Sage.
McCrae, R. R., & Costa, P. T. (1997). Personality trait structure as a human universal. American
Psychologist, 52, 509-516.
Oh, E., & Neville, H. (2004). Development and validation of the Korean rape myth acceptance scale.
The Counseling Psychologist, 32, 301-331.
Pedersen, P. B. (1991). Counseling international students. The Counseling Psychologist, 19, 10-58.
Pedersen, P. B. (2003). Culturally biased assumptions in counseling psychology. The Counseling Psychologist, 31, 396-403.
Ployhart, R. E., & Oswald, F. L. (2004). Applications of mean and covariance structure analysis:
Integrating correlational and experimental approaches. Organizational Research Methods, 7,
27-65.
Poasa, K. H., Mallinckrodt, B., & Suzuki, L. A. (2000). Causal attributions for problematic family
interactions: A qualitative, cultural comparison of Western Samoa, American Samoa, and the
United States. The Counseling Psychologist, 28, 32-60.
Ponterotto, J. G., Casas, J. M., Suzuki, L. A., & Alexander, C. M. (Eds.). (1995). Handbook of
multicultural counseling (2nd ed.). Thousand Oaks, CA: Sage.
Shin, J. Y., Berkson, G., & Crittenden, K. (2000). Informal and professional support for solving
psychological problems among Korean-speaking immigrants. Journal of Multicultural
Counseling and Development, 28, 144-159.