To cite this article: Mohammad Khatib & Saeed Rezaei (2013) A model and questionnaire of
language identity in Iran: a structural equation modelling approach, Journal of Multilingual and
Multicultural Development, 34:7, 690-708, DOI: 10.1080/01434632.2013.796958
Taylor & Francis makes every effort to ensure the accuracy of all the information (the
“Content”) contained in the publications on our platform. However, Taylor & Francis,
our agents, and our licensors make no representations or warranties whatsoever as to
the accuracy, completeness, or suitability for any purpose of the Content. Any opinions
and views expressed in this publication are the opinions and views of the authors,
and are not the views of or endorsed by Taylor & Francis. The accuracy of the Content
should not be relied upon and should be independently verified with primary sources
of information. Taylor and Francis shall not be liable for any losses, actions, claims,
proceedings, demands, costs, expenses, damages, and other liabilities whatsoever or
howsoever caused arising directly or indirectly in connection with, in relation to or arising
out of the use of the Content.
This article may be used for research, teaching, and private study purposes. Any
substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing,
systematic supply, or distribution in any form to anyone is expressly forbidden. Terms &
Conditions of access and use can be found at http://www.tandfonline.com/page/terms-
and-conditions
Journal of Multilingual and Multicultural Development, 2013
Vol. 34, No. 7, 690–708, http://dx.doi.org/10.1080/01434632.2013.796958
English Language and Literature Department, Allameh Tabataba’i University, Tehran, Iran
(Received 14 November 2012; final version received 27 March 2013)
Introduction
Language, as an identification badge, provides one of the most telling clues to people’s
identity and where they belong. This symbiotic relationship between language and
identity is widely supported in the literature, and recent publications
corroborate this close affinity (Block 2007; Edwards 2009; Joseph 2004; Llamas and
Watt 2010; Ricento 2005). In spite of this close relation between language and
identity, the fuzziness and malleability of identity have limited studies mainly to
qualitative approaches. Identity research was initially conducted by anthropologists,
sociologists and psychologists, and a review of studies in these disciplines
indicates that quantitative approaches are more welcome in these fields (e.g. Phinney
1992; Van Zomeren, Postmes, and Spears 2008). This is in sharp contrast to identity
research in language studies, where quantitative methods are usually neglected.
Nevertheless, this tendency towards quantitative research in neighbouring disciplines
has also affected language identity research and recent studies have also endorsed
this trend (e.g. Ehala 2012; Polat and Mahalingappa 2010). Though qualitative
approaches such as ethnography, interviewing and diary studies have been very
fruitful, the best way to overcome the shortcomings in these research methods seems
to be developing a framework or model for research. Attempts to develop such
models have been very successful and include models of language, culture and
identity in different countries and contexts including Israel (Golan-Cook and
Olshtain 2011).
Considering all the above-mentioned arguments, this study pursued three main
objectives. The first objective of this study was to develop a tentative hypothesised
model of language identity in Iran. As the second objective of this study, a
questionnaire was developed and validated to test the hypothesised model. Finally, in
the last phase the data gathered through this questionnaire were fed into the model
to see to what extent the model fit the data.
Downloaded by [Tulane University] at 04:33 04 September 2014
The study
Researching identity in applied linguistics is achieved through a number of
methodological tools including interviews, ethnographic observation and questionnaires,
inter alia (Rezaei 2012). Although interviews and ethnography are two valuable
research tools, they are usually time-consuming and costly to administer and
score. The potential practical problems inherent in interviewing and ethnographic
observation make the use of validated questionnaires a viable solution.
Although some researchers have used questionnaires as a way to collect their data
in identity and attitudinal studies (e.g. Shaaban and Ghaith 2003), they have mainly
research as the whole research focused on identity and its relation with language.
Globalisation and language teaching and learning issues (e.g. Block and Cameron
2002; Coupland 2010) were also utilised as language identity is predominantly
affected by globalisation. Attitude is the main source of identity and, as Dyer (2007)
puts it, part of an individual’s identity is formed through accent and phonology. In other
words, language identity can be partially recognised through the accent, dialect or
pronunciation that an individual adopts. Hence, part of the model provided here is
devoted to pronunciation attitudes based on the works in the literature (e.g. Garrett
2010; Jenkins 2007). In addition, the sociolinguistics of identity (e.g. Omoniyi and White
2006) and the sociology of language (e.g. Bourdieu 1991; Giles and Clair 1979; Spolsky
2011) informed the development of the model, because this study falls
within the sociolinguistic domain of language studies and concerns how language acts as a
prevailing social factor in identity formation.
Iranians consider the Persian language one of the main pillars of national identity
in Iran. Hence, theories and works on language and national identity (e.g. Joseph
2004; Simpson 2007) were also useful. In addition, recent studies show that English
language learners have started to adopt their local varieties of English as legitimised
forms of English, and hence the works on World Englishes and postcolonialism
(Brutt-Griffler 2002; Pennycook 1998) helped in composing some of the items in the
questionnaire.
One component of the model was related to how people associate their social
status in society with the language variety they adopt. The degree to which people associate
their social status with the language (Persian, English, etc.) they speak is affected by the
extent to which they value the language they use. In addition, language policy issues in the
literature (e.g. Ricento 2006; Shohamy 2006; Spolsky 2003) were also used because in
Iran the dominant language policy is to value the Persian language written in Arabic
script. Minority languages or non-official languages such as Azari and Kurdish are
not recognised for instruction purposes at schools and universities in spite of
the large number of people in Iran speaking these languages. Finally, some local
works in Iran about Persian language and identity were helpful in shaping the model
(e.g. Meskoob 1992).
After reviewing the above-mentioned literature, a number of components were
specified to encapsulate language identity in Iran. In order to confirm the representativeness,
appropriateness and accuracy of these components, a group of experts on
linguistics, sociolinguistics and sociology in Iran and abroad was consulted. The
interviews with these experts were held in both Persian and English and took between
30 and 60 minutes. The content of the interviews centred on the components of language
identity in Iran. The interviewees were first asked what they considered to constitute language
identity, and the components they mentioned were written down. At the end of the
interviews, the components they proposed were compared with the ones we had
selected a priori. In some interviews, the components selected a priori were shown
to the interviewees at the end of the session for them to reflect on. This gave them
food for thought about what they had articulated and thereby helped them
give more constructive comments. After all these substantive discussions, the
components were respecified and reconfigured, with some minor changes in the
labels of the components, and one new component was added.
Having reviewed the literature on language and identity, we drafted out six main
components for language identity in Iran including attachment to the Persian
language, pronunciation attitude, language and social status, L1 use/exposure in the
694 M. Khatib and S. Rezaei
exposure in the society was another component that was related to the vitality of
the Persian language in society and to whether Persian is used by English language
learners in the social context in Iran. Some English language learners in Iran become
so mesmerised by and attached to the English language that they use English in their
face-to-face or online daily conversations and communications. Language knowledge was
also important because part of Iranians’ (language) identity is manifested in their
Persian language and literature. In other words, poetry and literary texts have been
very influential in Iranians’ identity. Finally, script and alphabet were included in the
model because Iranians have been writing in the Arabic alphabet since the Arabs
invaded Persia in the seventh century. Since then, the Arabic script has been used for writing;
however, in the twentieth century a group of Iranian intellectuals proposed that the Latin
alphabet be used as the writing system in Iran, which raised certain concerns, and this
proposal was finally rejected. Nowadays, many Iranians use Latinised Persian
(Penglish) in their online communication or in their text messaging. There seems to
be a strong proclivity among the younger generation in Iran to write Persian in the Latin
alphabet. That is why script/alphabet was also included in the model, to show how
far Iranians prefer Latin or Arabic for their writing system. Table 1 below shows the
definition for each of the identified components of language identity in Iran.
Respondents
This study was conducted between July 2011 and August 2012, and the respondents were
English language learners in Iran from different language proficiency levels, ages,
Table 1. Components of language identity in Iran.
No. Component: Definition
1 Attachment to the Persian language: This component refers to how people in Iran think and feel about their language in comparison to English language as the main foreign language.
2 Pronunciation attitude: This component refers to Iranians’ attitudes towards their pronunciation patterns in Persian and English and which pronunciation they perceive as desirable.
3 Language and social status: This component shows how individuals associate their social status with the language in which they speak. In other words, are they proud of their own language or do they associate their low or high social status to the language they speak?
4 L1 use/exposure in the society: It refers to the extent Iranians use Persian in their daily life in comparison to other competing languages, in this case English.
5 Language knowledge: It refers to how much information Iranians have about their own language, its history and literature.
6 Script/alphabet: It refers to how Iranians feel about the alphabet and writing system in their language.
Questionnaire development
In order to develop a reliable and valid questionnaire, the researchers went through
the following steps.
the items for the questionnaire, the researchers not only checked the related
questionnaires already developed by others, but also asked a number of figures in the
field working on language identity to provide some good items for the questionnaire
(expert opinion).
In addition, in developing the items the researchers tried to include the same
number of positively and negatively worded items. Some weak
questionnaires are worded in a way that leads most of the responses to fall on either
the negative or the positive side of the rating scale (e.g. strongly agree). In this
questionnaire, the researchers avoided this bias by providing a balanced
number of positively and negatively worded items. For later analyses, the negatively
worded items were reverse coded.
The rating scale utilised in the current study was based on the Likert scale, the most
popular and widely used rating scale, named after its inventor, Rensis Likert. The researchers
employed six options: strongly agree, agree, slightly agree, slightly disagree,
disagree and strongly disagree. It should be mentioned here that the researchers initially
opted for a five-option scale: strongly agree, agree, no idea (undecided),
disagree and strongly disagree. However, after reviewing the literature on questionnaire
development (e.g. Dörnyei 2010), the researchers came to know that Iranians are
generally conservative in their responses in spite of anonymity and might merely
choose ‘no idea: undecided’ for some seemingly sensitive items. As a result, the six-option
scale was selected so that the respondents could not hedge. Another reason for doing so
was to help the data approximate a normal distribution. Respondents showed their degree
of agreement/disagreement with each statement on a six-point Likert-type scale. To score
the items, ‘strongly agree’ received six points, ‘agree’ five points, ‘slightly agree’ four
points and so on. Scoring was reversed for the negatively worded items.
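The scoring rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the point values follow the text (six for ‘strongly agree’ down to one for ‘strongly disagree’), while the item identifiers, the sample answers and the function names are hypothetical and do not come from the questionnaire itself.

```python
# Six-point scale described in the text: 'strongly agree' = 6 ... 'strongly disagree' = 1.
SCALE = {
    "strongly agree": 6,
    "agree": 5,
    "slightly agree": 4,
    "slightly disagree": 3,
    "disagree": 2,
    "strongly disagree": 1,
}


def score_response(label, negatively_worded=False):
    """Return the numeric score for one response, reverse-coding if needed."""
    points = SCALE[label.strip().lower()]
    # Reverse coding on a 1..6 scale maps 6 -> 1, 5 -> 2, ..., 1 -> 6.
    return 7 - points if negatively_worded else points


def score_respondent(responses, negative_items):
    """Score a dict of {item_id: label}; negative_items is a set of item ids."""
    return {
        item: score_response(label, item in negative_items)
        for item, label in responses.items()
    }


# Hypothetical example: item ids and answers are invented for illustration.
answers = {"v1": "strongly agree", "v2": "slightly disagree", "v3": "agree"}
scores = score_respondent(answers, negative_items={"v2"})
print(scores)
```

Because the scale has no midpoint, reverse coding is simply seven minus the raw score, so a reversed ‘slightly disagree’ (3) becomes 4.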
nonexperts was as valuable as the one from the experts. Content representativeness
and bias were simultaneously investigated. This panel of experts included professionals
in the fields of applied linguistics, sociolinguistics, sociology, Iranian studies,
and survey design and statistics.
The panel of experts was requested to rate the items on a Likert-type scale
from one to four. In this scale, one designated ‘Not important to be included in the
survey’, two ‘Somehow important to be included’, three ‘Important to be included’
and four ‘Extremely important to be included in the survey’. They were
additionally asked to record a final decision on each item by selecting either ‘omit’ or
‘keep’.
The responses obtained from this step reduced the number of items from 40 to
26. The 14 discarded items were dropped for a number of reasons
mentioned by the panel, including the redundancy, ambiguity, length and irrelevance
of the items. The criteria to keep an item or omit it from the questionnaire were based
The title of the questionnaire, that is, language identity questionnaire, was
removed during the administration because it might have affected the participants’
responses.
In developing the items, the researchers were also careful not to make double-
negative items because they would sometimes make the items confusing.
Questions on age, education, language proficiency level, etc. were initially open-ended
in this questionnaire. However, they were later turned into pre-determined
categories to ease later analyses.
After considering all the above points, the questionnaire was administered for an
initial piloting. Up to this point, 26 items had been generated. Since this was the initial
pilot study, the questionnaire was administered to 36 students similar to the target
population for which the questionnaire was designed. The researchers administered
the questionnaire in the traditional way, that is, by hand.
The feedback was very helpful in modifying some of the items and discarding one.
To establish the content validity of the questionnaire, the questionnaire was given
to a panel of experts, as discussed above, to judge how far the items were representative
of a language identity questionnaire. Moreover, the experts reflected on the
wording and the interpretation of the items, and also on the instructions given. To
check the content validity further, the questionnaire was also given to five English language
learners from the target population, who responded to it using think-aloud techniques.
After these two stages of checking the content validity, some items were reworded.
All these changes, relating to face and content validity, were made prior to the
reliability phase; that is, content validity was established before the reliability was
estimated.
After all these steps, the researchers came up with 20 items tapping the six
components of language identity in Iran. Table 2 below shows the six components in
the questionnaire (see Note 1), their related items and their reliability indices.
In order to establish the construct validity, two procedures were employed.
First, the questionnaire was checked for its congruency with the theories in the
literature regarding language and identity, as discussed above. This step was done
iteratively by checking the items against the work of researchers in the literature. Next,
exploratory and confirmatory factor analyses were run on two separate
administrations to check the validity statistically. Nevertheless, a number
of criteria must be met before running factor analysis.
The first step in factor analysis is to assess the suitability of the data for factor
analysis. Two criteria must be met: ‘sample size and the strength of association among
the variables (or items)’ (Pallant 2007, 180). Regarding the sample size, there are different views
among researchers, the most conservative of which is the larger the better. In this
study, the criterion was five to ten respondents for each item. The one hundred
and ninety-three participants who took part in the exploratory factor analysis phase
met this criterion.
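The respondents-per-item rule of thumb amounts to a single division, sketched below. The 193 respondents come from the text; the count of 20 items is an assumption made for illustration, since the number of items analysed in the exploratory phase is not stated in this passage, and the function names are hypothetical.

```python
def respondents_per_item(n_respondents, n_items):
    """Ratio of respondents to questionnaire items."""
    return n_respondents / n_items


def meets_rule_of_thumb(n_respondents, n_items, minimum_ratio=5):
    """True if there are at least `minimum_ratio` respondents per item."""
    return respondents_per_item(n_respondents, n_items) >= minimum_ratio


# 193 respondents is from the text; 20 items is an assumed value.
ratio = respondents_per_item(193, 20)
print(f"{ratio:.2f} respondents per item; adequate: {meets_rule_of_thumb(193, 20)}")
```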
The second criterion concerning the suitability of running factor analysis is
related to the inter-correlations among the items in the questionnaire. Bartlett’s test
of sphericity and the Kaiser-Meyer-Olkin (KMO) measure are used to assess this criterion.
For these two measures to indicate factorability of the data, Bartlett’s test of
sphericity should be significant, that is, p < 0.05, and the KMO index, which ranges from
0 to 1, should not be below 0.6; otherwise the data will not be considered appropriate
for running factor analysis. For the current study, as shown in Table 3, the KMO
measure was above 0.60 (KMO = 0.76) and
Bartlett’s test of sphericity was significant (p = 0.00). These two values
indicate that there are some significant factors to be extracted from the data.
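For readers who want to see how these two diagnostics are usually computed, the following is a minimal sketch using NumPy and the standard textbook formulas; it is not the output or procedure of the software used in the study. The data are synthetic (six correlated items for 193 simulated respondents), and the function names are hypothetical.

```python
import numpy as np


def bartlett_sphericity(data):
    """Return (chi_square, degrees_of_freedom) for Bartlett's test of sphericity."""
    n, p = data.shape
    corr = np.corrcoef(data, rowvar=False)
    _, logdet = np.linalg.slogdet(corr)  # log of the determinant of the correlation matrix
    chi_square = -(n - 1 - (2 * p + 5) / 6) * logdet
    dof = p * (p - 1) // 2
    return chi_square, dof


def kmo(data):
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy."""
    corr = np.corrcoef(data, rowvar=False)
    inv = np.linalg.inv(corr)
    # Partial correlations are obtained from the inverse correlation matrix.
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale
    off_diagonal = ~np.eye(corr.shape[0], dtype=bool)
    r2 = np.sum(corr[off_diagonal] ** 2)
    a2 = np.sum(partial[off_diagonal] ** 2)
    return r2 / (r2 + a2)


# Synthetic data for illustration only: six items sharing one common factor.
rng = np.random.default_rng(0)
common = rng.normal(size=(193, 1))
data = common + 0.8 * rng.normal(size=(193, 6))

chi_square, dof = bartlett_sphericity(data)
kmo_value = kmo(data)
print(f"Bartlett chi-square = {chi_square:.1f} on {dof} df; KMO = {kmo_value:.2f}")
```

The p-value for Bartlett’s test would then be read from a chi-square distribution with `dof` degrees of freedom; it is left out here to keep the sketch dependent on NumPy alone.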
After determining the factorability of the data, factor analysis was run based on
principal components analysis (PCA). In order to decide on the number of factors
to be retained, Kaiser’s criterion was used, according to which only factors with
eigenvalues of 1.0 or more are retained. For the current questionnaire, the scree plot in Figure 1
indicates six factors with eigenvalues above 1. The six factors accounted for 77.24% of the
total variance (usually anything over 60% is good in this case). These six factors
accounted for 28.96%, 18.93%, 9.31%, 8.64%, 6.40% and 5.28% of the total variance,
respectively.
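Kaiser’s criterion and the percentage of variance explained can be illustrated as follows. The eigenvalues below are hypothetical values for an eight-item example and are not the study’s eigenvalues; the function names are likewise invented for this sketch.

```python
def kaiser_retained(eigenvalues, threshold=1.0):
    """Eigenvalues kept under Kaiser's criterion (eigenvalue >= threshold)."""
    return [ev for ev in eigenvalues if ev >= threshold]


def percent_variance(eigenvalues):
    """Percentage of total variance accounted for by each component.

    For PCA on a correlation matrix, the eigenvalues sum to the number of items.
    """
    total = sum(eigenvalues)
    return [100 * ev / total for ev in eigenvalues]


# Hypothetical eigenvalues for an eight-item example (they sum to 8).
eigenvalues = [2.9, 1.6, 1.1, 0.8, 0.6, 0.5, 0.3, 0.2]

retained = kaiser_retained(eigenvalues)
shares = percent_variance(eigenvalues)
cumulative = sum(shares[: len(retained)])
print(f"{len(retained)} components retained, explaining {cumulative:.1f}% of the variance")
```

In this invented example three components pass the threshold; the same two steps, applied to the eigenvalues of the questionnaire data, yield the number of retained factors and their share of the total variance.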
Variable communalities were greater than 0.30 for all the items. Communality
values for this questionnaire ranged from 0.53 to 0.74. The factor correlations for the
questionnaire were all at acceptable levels, with the highest correlation between factors
1 and 4 (r = 0.71), followed by 1 and 2 (r = 0.70), 2 and 3 (r = 0.68), 3
and 4 (r = 0.67), 1 and 6 (r = 0.65), 4 and 5 (r = 0.62), 1 and 3 (r = 0.55), 2 and
5 (r = 0.55), 4 and 6 (r = 0.53), 3 and 6 (r = 0.50), 2 and 6 (r = 0.48), 1 and
Component
1 2 3 4 5 6
v1 0.655
v2 0.623
v3 0.879
v4 0.567
v5 0.675
v6 0.723
v7 0.574
v8 0.562
v9 0.598
v10 0.553 0.565
v11 0.617 0.656
In order to test the hypothesised model, AMOS 21 was run and the maximum
likelihood method was used to estimate the parameters. The participants in this
part of the study were 482 English language learners who ranged in age
from 15 to 35 years, with a mean age of 22 years. They were from different parts of
the country and had different demographic characteristics. The researchers
sampled in this way deliberately, to test the model for the whole country rather than
limiting it to the capital city. Table 5 shows the descriptive statistics (e.g. age, gender, ethnicity
and language proficiency level) for the participants in this phase of the study. As can be
seen in the data presented in Table 5, some of the respondents did not fill out the
parts about their demographic information (missing data) and subsequently 468
participants were included for SEM. The demographic information gathered here
was intended to be included as variables in the model. Nevertheless, these variables
were excluded from the model to avoid complexity. For models at a nascent stage,
such as the one in this study, it is highly advisable not to make them
convoluted. However, future studies (see Note 2) can utilise these variables as latent
variables and hence discover the relations among all these variables.
In order to report the model fit, there are three common absolute fit indices
including:
- χ², according to which a nonsignificant χ² (p > 0.05) indicates good fit;
- Root Mean Squared Error of Approximation (RMSEA): acceptable fit < 0.10
and good fit < 0.05; hence the smaller the RMSEA, the better the
model fits; and
- Goodness of Fit Index (GFI): a value > 0.90 is considered good fit.
Table 5. Descriptive statistics for the participants.
Variable              Category       n      %
Age                   11–15         46    9.6%
                      16–20         56   11.7%
                      21–25        128   26.7%
                      >25          249   52%
                      Total        479
Gender                Male         211   43.9%
                      Female       268   55.8%
                      Total        480
Ethnicity             Fars         298   62.7%
                      Azari         58   12.2%
                      Kurd          41    8.6%
                      Arab          40    8.4%
                      Lor           33    6.9%
                      Other          5    1%
                      Total        475
Language proficiency  Basic         22    4.5%
                      Elementary    46    9.6%
                      Pre-inter     52   10.8%
                      Inter         67   13.9%
                      High Inter    92   19.2%
                      Advanced     200   41.7%
                      Total        479
In this study, absolute fit indices were taken into account because there was no
previous model to test this model against. The initial results of SEM showed poor
fit for the model. The reasons for this lack of fit were related to the
complexity of the model, which included not only the six factors but also some
demographic information. Hence, some changes were made in the model to make it
fit the data. These changes included removing some of the restrictions in the model,
including the demographic information, and instead focusing on the main factors
proposed a priori. In addition, one of the items (item 12) was removed because it
showed low factor loadings. The model was thus revised and SEM was run once
again. The output of the second SEM showed χ² = 4.42, df = 155, p = 0.02, which
is a significant value for chi-square. Since the chi-square value is dependent on
sample size and is usually significant for samples of 400 and more, χ²/df is used as a
solution, which is 4.42/155 = 0.02 and is considered an acceptable value (see
Table 6). The results of the second SEM also indicated GFI = 0.974 and RMSEA =
0.00, which were also acceptable. Table 6 shows the indices for SEM and shows a
desirable level of fit based on the output from AMOS 21. Hence, all the indices
are at an acceptable level and the model seems to fit well. In other words, the
data gathered in this study seem to support this model.
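The fit criteria listed earlier can be checked against the reported values with a short script. This is a sketch under stated assumptions rather than a reproduction of the AMOS output: the χ², df and GFI values are those given in the text, N = 468 is the SEM sample size reported earlier, and the RMSEA is computed with one common formulation, sqrt(max(χ² − df, 0) / (df (N − 1))), which may differ in detail from what a given program implements. The function names are hypothetical.

```python
import math


def chi_square_ratio(chi_square, dof):
    """Chi-square divided by its degrees of freedom."""
    return chi_square / dof


def rmsea(chi_square, dof, n):
    """One common RMSEA formula: sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    return math.sqrt(max(chi_square - dof, 0.0) / (dof * (n - 1)))


def assess_fit(chi_square, dof, n, gfi):
    """Compare the indices with the cut-offs quoted in the text."""
    ratio = chi_square_ratio(chi_square, dof)
    error = rmsea(chi_square, dof, n)
    return {
        "chi2/df": ratio,
        "chi2/df below 3": ratio < 3,
        "rmsea": error,
        "rmsea good (< 0.05)": error < 0.05,
        "rmsea acceptable (< 0.10)": error < 0.10,
        "gfi good (> 0.90)": gfi > 0.90,
    }


# chi2, df and GFI as reported in the text; n = 468 is the SEM sample size.
report = assess_fit(chi_square=4.42, dof=155, n=468, gfi=0.974)
for name, value in report.items():
    print(f"{name}: {value}")
```

Because the reported χ² is smaller than its degrees of freedom, this formula returns an RMSEA of zero, in line with the RMSEA value reported above.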
Figure 2 shows the schematic representation of the recursive model of language
identity in Iran. Path coefficients are also put on the pathways from each latent
variable to other latent or observable variables to show the strength of relation or
correlation among the variables.
Table 6. Fit indices for SEM.
Index     Value   Criterion
χ²        4.42    p > 0.05
χ²/df     0.02    < 3
GFI       0.98    > 0.90
RMSEA     0.00    < 0.05
Figure 2. Final model of language identity for English language learners in Iran.
Note: F1, F2, F3, F4, F5 and F6 are the factors identified in EFA.
questionnaire, the researchers hereby recommend that more rigorous studies be
conducted to test this model and possibly add more components and subcomponents
to it, as is the case with other models in language studies (e.g. the communicative
competence model). In other words, although the data gathered in this study through
a reliable and valid questionnaire seem to have fit the model, this does not make
the model immune to deficiencies. The researchers believe that
replication studies collecting data from different groups of Iranians are required to
reduce confounding variables and subsequently enhance the reliability and validity of
this model.
Acknowledgements
We thank the Iranian Ministry of Science, Research, and Technology, which
funded and supported this research. We would also like to thank Professor Ingrid Piller at
Macquarie University for her support, critical insights and intellectual guidance. Special
thanks are also extended to Dariush Izadi for his help and encouragement during this project.
Notes
1. The complete version of the questionnaire in both English and Persian is available upon
request.
2. This study is part of a larger PhD project and the model and the questionnaire developed
here are utilised in a follow-up nation-wide survey to study the language identity level of
Iranian English language learners from different age groups, genders and language
proficiency levels.
References
Alderson, C. J., and J. Banerjee. 1996. How Might Impact Study Instruments be Validated?
Unpublished manuscript commissioned by UCLES.
Baker, C. 2011. Foundations of Bilingual Education and Bilingualism. 5th ed. Bristol:
Multilingual Matters.
Balistreri, E., N. A. Busch-Rossnagel, and K. F. Geisinger. 1995. “Development and
Preliminary Validation of the Ego Identity Process Questionnaire.” Journal of Adolescence
18 (2): 179–192. doi:10.1006/jado.1995.1012.
Bani-Shoraka, H. 2009. “Cross-Generational Bilingual Strategies among Azerbaijanis in
Tehran.” International Journal of the Sociology of Language 198: 105–127.
doi:10.1515/IJSL.2009.029.
Beeman, W. O. 2010. “Sociolinguistics in the Iranian World.” In The Routledge Handbook of
Sociolinguistics in the World, edited by M. J. Ball, 139–148. London: Routledge.
Block, D. 2007. Second Language Identities. London: Continuum.
Block, D., and D. Cameron, eds. 2002. Globalization and Language Teaching. London:
Routledge.
Boroujerdi, M. 1998. “Contesting Nationalist Constructions of Iranian Identity.” Critique 12:
43–55. doi:10.1080/10669929808720120.
Bourdieu, P. 1991. Language and Symbolic Power. Translated by G. Raymond and
M. Adamson and edited by J. B. Thompson. Cambridge: Polity Press. (Original work
published in 1982.)
Brady, S., and W. J. Busse. 1994. “The Gay Identity Questionnaire.” Journal of Homosexuality
26 (4): 1–22. doi:10.1300/J082v26n04_01.
Van Zomeren, M., T. Postmes, and R. Spears. 2008. “Toward an Integrative Social Identity
Model of Collective Action: A Quantitative Research Synthesis of Three
Socio-psychological Perspectives.” Psychological Bulletin 134 (4): 504–535.
doi:10.1037/0033-2909.134.4.504.
Windfuhr, G. L. 2009. “Persian.” In The World’s Major Languages, 2nd ed., edited by
B. Comrie, 445–459. London: Routledge.
Yarshater, E. 1993. “Persian Identity in Historical Perspective.” Iranian Studies 26 (1–2):
141–142. doi:10.1080/00210869308701791.