
Journal of Consumer Research, Inc.
A Meta-Analysis of Cronbach's Coefficient Alpha

Author(s): Robert A. Peterson
Source: Journal of Consumer Research, Vol. 21, No. 2 (Sep., 1994), pp. 381-391
Published by: The University of Chicago Press
Stable URL: http://www.jstor.org/stable/2489828 .
Accessed: 12/02/2013 21:59
A Meta-analysis of Cronbach's
Coefficient Alpha
ROBERT A. PETERSON*
Despite some limitations, Cronbach's coefficient alpha remains the most widely
used measure of scale reliability. The purpose of this article was to empirically
document the magnitudes of alpha coefficients obtained in behavioral research,
compare these obtained values with guidelines and recommendations set forth by
individuals such as Nunnally (1967, 1978), and provide insights into research design
characteristics that may influence the size of coefficient alpha. Average reported
alpha coefficients ranged from .70 for values and beliefs to .82 for job satisfaction.
With few exceptions, there were no substantive relationships between the magnitude
of coefficient alpha and the research design characteristics investigated.
*Robert A. Peterson holds the John T. Stuart III Centennial Chair in Business Administration and is the Charles E. Hurwitz Fellow at the IC2 Institute, University of Texas at Austin, Austin, TX 78712. This research was supported in part by the IC2 Institute. The opinions expressed in the article are those of the author and do not necessarily reflect the views of the institute.

© 1994 by JOURNAL OF CONSUMER RESEARCH, Inc. Vol. 21, September 1994. All rights reserved. 0093-5301/95/2102-0012$2.00

There is virtual consensus among researchers that, for a scale to be valid and possess practical utility, it must be reliable. Conceptually, reliability is defined as "the degree to which measures are free from error and therefore yield consistent results" (Peter 1979, p. 6). As such, the reliability of a scale places a limit on its construct validity.

However, despite its importance, there is surprisingly little guidance in the literature as to what constitutes "acceptable" or "sufficient" reliability for research purposes. Table 1 contains illustrative recommendations regarding minimally acceptable reliability. Although the recommendations differ somewhat, they share two commonalties. First, they indicate that the required degree of reliability is a function of the research purpose, whether the research is exploratory, applied, or so forth. For example, a scale in the preliminary stages of development is generally not thought to require the reliability of one used to discriminate between groups or of one being used to make decisions about individuals. Second, none of the recommendations have an empirical basis, a theoretical justification, or an analytical rationale. Rather, they appear to reflect either "experience" or intuition.

Of the recommendations contained in Table 1, those of Nunnally (1967, 1978) are the most widely referenced, either in support or criticism of an obtained reliability coefficient. For example, both Churchill (1979) and Peter (1979), in widely cited articles, endorsed Nunnally's recommendations. More generally, in the last dozen years Nunnally has been cited in support of obtained reliability coefficients more than 50 times in the Journal of Marketing Research and three dozen times in the Journal of Consumer Research. It is interesting, though, that Nunnally changed his reliability recommendations from his 1967 edition of Psychometric Theory in his 1978 edition. In 1967, he recommended that the minimally acceptable reliability for preliminary research should be in the range of .5 to .6, whereas in 1978 he increased the recommended level to .7 (without explanation).

Purpose

The purpose of conducting the present research was twofold. First, it was to empirically ascertain and document the magnitudes of reliability coefficients actually obtained in empirical studies and compare these coefficients with the recommendations set forth in Table 1. A second purpose was to determine whether relationships respectively exist between the magnitude of a reliability coefficient and selected individual difference constructs and research design characteristics. If reliability coefficients systematically vary across well-defined individual difference constructs or research design characteristics, it might be possible to derive "comparison standards" analogous to the base rates developed by Peterson, Albaum, and Beltramini (1984) in their investigation of effect sizes in consumer behavior experiments. Such comparison standards would complement existing guidelines like those in Table 1 and should facilitate the interpretation of reliability coefficients obtained in empirical studies.

In addition to being motivated by a lack of consistent, generalizable information regarding the magnitude of reliability coefficients typically obtained in behavioral research, the present article was motivated in part by
the research of Churchill and Peter (1984) and bears many similarities to their work (see also Peter and Churchill 1986). Even so, this research differs from that of Churchill and Peter in several regards. First, there is a subtle difference in the objectives, and consequently the scope, of the two endeavors. Although both studies can be described as meta-analyses, Churchill and Peter focused on investigating "the effects of research design on reliability estimates" (p. 360), whereas the current focus was somewhat broader in that it attempted to provide base rate comparison standards at a construct level.

TABLE 1
SELECTED RECOMMENDED RELIABILITY LEVELS

Author                                 Situation                        Recommended level
Davis (1964, p. 24)                    Prediction for individual        Above .75
                                       Prediction for group of 25-50    .5
                                       Prediction for group over 50     Below .5
Kaplan and Saccuzzo (1982, p. 106)     Basic research                   .7-.8
                                       Applied research                 .95
Murphy and Davidshofer (1988, p. 89)   Unacceptable level               Below .6
                                       Low level                        .7
                                       Moderate to high level           .8-.9
                                       High level                       .9
Nunnally (1967, p. 226)                Preliminary research             .5-.6
                                       Basic research                   .8
                                       Applied research                 .9-.95
Nunnally (1978, pp. 245-246)           Preliminary research             .7
                                       Basic research                   .8
                                       Applied research                 .9-.95

The difference in objectives resulted in three notable study differences. First, Churchill and Peter included six different reliability coefficients in their analyses. Because of the admonitions of Guilford (1965, chap. 17) that different reliability coefficients are not conceptually or numerically comparable, the present investigation focused on only one reliability coefficient, Cronbach's (1951) coefficient alpha. Second, the present study was significantly larger in scope in terms of the journals reviewed, the years covered, and the number of reliability coefficients analyzed. For example, Churchill and Peter limited their investigation to marketing-related journals over 12 years, whereas the present research included both marketing and psychology journals over 33 years. This resulted in nearly 30 times the number of reliability coefficients analyzed by Churchill and Peter. Third, the two studies differed in their sampling approaches. Churchill and Peter followed what might be termed the Mansfield and Busse sampling approach, whereas the present research followed a traditional Glassian sampling approach (Bangert-Drowns 1986). Churchill and Peter selectively sampled at most two reliability coefficients from each study reviewed, the largest and the smallest reported reliability coefficients, whereas the present study used all alpha coefficients found in a reviewed study.

Coefficient Alpha

Oversimplifying somewhat, there are two general categories of reliability coefficients, those based on longitudinal data (e.g., the test-retest reliability coefficient) and those based on cross-sectional data (e.g., internal consistency reliability coefficients and equivalence reliability coefficients). By far the most commonly used reliability coefficient is coefficient alpha, an estimator of internal consistency.

Coefficient alpha was developed by Cronbach (1951) as a generalized measure of the internal consistency of a multi-item scale. It is formulated as

\[ \alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k}\sigma_i^2}{\sigma_t^2}\right) \]

or

\[ \alpha = \frac{k\bar{r}}{1 + \bar{r}(k-1)}, \]

where $k$ is the number of items in the scale, $\sigma_i^2$ is the variance of item $i$, $\sigma_t^2$ is the variance of the scale, and $\bar{r}$ is the average interitem correlation.

Whether by acclamation (see, e.g., Churchill 1979; Gerbing and Anderson 1988; Peter 1979) or citation, coefficient alpha has effectively become the measure of choice for estimating the reliability of a multi-item scale. Indeed, coefficient alpha has become one of the foundations of measurement theory. Because it is a generalized intraclass correlation coefficient, coefficient alpha can be derived from the theory of true and error scores, as well as from the domain sampling model. According to the Social Science Citation Index, Cronbach's 1951 article has been referenced in more than 2,200 articles in the last 20 years. Not only is coefficient alpha the most widely used estimator of reliability, but also it has been the subject of considerable methodological and analytical attention (see, e.g., Cortina 1993). Therefore, focusing on coefficient alpha should not detract from the generality of the research. Instead, it should improve the usefulness of the research, because there is no heterogeneity in the data due to the presence of other reliability coefficients.

METHODOLOGY

To obtain a large number and wide representation of alpha coefficients, an extensive literature review was undertaken. A census of eight psychology- and marketing-related journals was conducted, beginning with

the year 1960 (the year in which citations of Cronbach's work began to appear) and ending with 1992. This meant that every article published in these journals was systematically and individually examined to locate alpha coefficients. In addition, a convenience sample of selected issues of 16 other journals (e.g., Journal of Advertising Research, Journal of Business Research, Educational and Psychological Measurement, Journal of Educational Psychology, and others) and two conference proceedings (American Marketing Association, Association for Consumer Research) were examined for alpha coefficients. Finally, a sample of unpublished manuscripts (manuscripts that had been rejected at least once for publication) were examined for alpha coefficients. The "harvesting" of alpha coefficients and the coding of individual difference constructs and research design characteristics were done by the author and two experienced research assistants, with the bulk of the coding being done by the latter two individuals after an extensive pretest and a training session. A comparison of the research assistants' coding consistency for a sample of research design characteristics produced an interrater agreement coefficient of .92 (Perreault and Leigh 1989). Hence, coding was judged to be consistently done (as would be expected given the relatively straightforward nature of the task).

Table 2 contains the sources of the alpha coefficients analyzed in the study, as well as the years they were reviewed, the type of search undertaken, the number of coefficients obtained from each, and the mean and median alpha coefficient observed for each source. Across the sources, 4,286 alpha coefficients were harvested. All alpha coefficients were independent in the sense that they were estimated from distinct sets of items. For example, if alpha coefficients were reported for both an "overall" or composite scale and one or more subscales of that scale, only the subscale alpha coefficients were retained for analysis. Eliminating the composite scale from the meta-analysis minimized the possibility of interdependencies among the alpha coefficients.

Several aspects of Table 2 merit brief mention. First, as is apparent from the table, it was not possible to conduct a census of all journals back to 1960 because some, such as the Journal of Consumer Research, did not begin publication until after that date. Second, as previously noted, journals were purposely selected to provide a representative array of journals, research domains, and alpha coefficients. Finally, unpublished manuscripts consisted of manuscripts submitted to the journals and conference proceedings listed in Table 2. Manuscripts classified as unpublished can only be so defined for the submissions evaluated. Some of the manuscripts classified as unpublished may well have been previously rejected or subsequently published in an outlet unknown to the present researcher. Therefore, any interpretation of the alpha coefficients obtained from this category must take this caveat into consideration.

More than 33,000 articles, proceedings papers, and rejected manuscripts were individually examined during the course of data collection. From these, alpha coefficients were obtained from 832 different articles, proceedings papers, and manuscripts reporting data from 1,030 samples consisting of more than 300,000 individuals. Thus, on average there were 5.1 alpha coefficients per reviewed publication or manuscript and 4.1 alpha coefficients per sample. To obtain a conceptually coherent pool of alpha coefficients, only those alpha coefficients reported for rating scales designed to measure individual difference constructs such as personality, attitude, and opinion in nonspecial populations were included in the analysis. Excluded were alpha coefficients reported for forced-choice scales (e.g., constant-sum scales), scales used to measure interrater agreement, and scales developed or designed for special populations (e.g., institutionalized individuals).

Constructs Measured

All harvested alpha coefficients were categorized according to the underlying construct being measured. In most instances this was accomplished by simply accepting the construct designation employed in a study. However, in certain instances it was necessary to infer the appropriate construct category on the basis of nomenclature and terminology used.

After an extensive review of the behavioral literature and a preliminary analysis, 42 categories of constructs were initially constructed for classification purposes. Because of overlap among the categories and the small numbers of alpha coefficients in some of the categories, the number of categories was ultimately collapsed to 20, which included a miscellaneous category. As might be expected, the categories varied in terms of their level of generality and the number of scales they contained. The specific construct categories used are reported in Table 3. The largest category consisted of attitude constructs, with 699 alpha coefficients. The smallest category contained constructs relating to expectation, with 37 alpha coefficients.

Research Design Characteristics

For each alpha coefficient harvested, information was sought on 12 research design characteristics in addition to source and year of publication. The research design characteristics examined in this article have been posited for more than 65 years as influencing the size of a reliability coefficient (see, e.g., Symonds 1928). Each is briefly described below.

Although measurement theory does not consider the effect of sample size on the magnitude of an alpha coefficient, Churchill and Peter (1984) observed a negative relationship between the two. Because they were not

TABLE 2
SOURCES OF ALPHA COEFFICIENTS

Source                                         Time period covered  Data collection  Number of α's  Mean α  Median α
AMA/ACR Proceedings                            1971-1992            Sample           113            .76     .77
Journal of Applied Psychology                  1960-1992            Census           670            .79     .81
Journal of Consumer Research                   1974-1992            Census           166            .80     .81
Journal of Marketing                           1960-1992            Census           238            .76     .78
Journal of Marketing Research                  1964-1992            Census           639            .76     .79
Journal of Personality and Social Psychology   1960-1992            Census           724            .76     .79
Journal of Personality Assessment              1960-1992            Census           586            .77     .80
Journal of the Academy of Marketing Science    1972-1992            Census           387            .75     .76
Psychological Reports                          1960-1992            Census           418            .76     .79
Other journals^a                               1970-1992            Sample           30             .79     .82
Unpublished manuscripts^b                      1980-1992            Sample           315            .76     .77

^a See text for illustrative journals.
^b See text for description of unpublished manuscripts.
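
As a quick consistency check on Table 2, the per-source counts can be summed and the per-source means combined into a count-weighted mean. The sketch below only re-adds the numbers printed in the table; the function names are illustrative and not part of the article, and the weighted mean is an approximation because the table reports means rounded to two decimals.

```python
# Rows transcribed from Table 2: (source, number of alphas, mean alpha).
TABLE_2 = [
    ("AMA/ACR Proceedings", 113, 0.76),
    ("Journal of Applied Psychology", 670, 0.79),
    ("Journal of Consumer Research", 166, 0.80),
    ("Journal of Marketing", 238, 0.76),
    ("Journal of Marketing Research", 639, 0.76),
    ("Journal of Personality and Social Psychology", 724, 0.76),
    ("Journal of Personality Assessment", 586, 0.77),
    ("Journal of the Academy of Marketing Science", 387, 0.75),
    ("Psychological Reports", 418, 0.76),
    ("Other journals", 30, 0.79),
    ("Unpublished manuscripts", 315, 0.76),
]


def total_coefficients(rows):
    """Total number of alpha coefficients across all sources."""
    return sum(count for _, count, _ in rows)


def weighted_mean_alpha(rows):
    """Mean alpha across sources, weighting each source by its count."""
    total = total_coefficients(rows)
    return sum(count * mean for _, count, mean in rows) / total


if __name__ == "__main__":
    print(total_coefficients(TABLE_2))
    print(round(weighted_mean_alpha(TABLE_2), 2))
```

Run as a script, this gives a total of 4,286 coefficients and a weighted mean of about .77, which agrees with the totals stated in the text.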
able to explain their finding, the present research restudied the relationship.

Type of sample was operationalized as college student, consumer, businessperson, "mixed" (more than one type), or "cannot tell" (sample unclassifiable). Churchill and Peter (1984) hypothesized that college student samples should evince higher scale reliabilities than should noncollege student samples because students should be more experienced in completing questionnaires and perhaps more educated. However, their hypothesis was not supported. Given their finding and a lack of conceptual support for the hypothesis of different alpha coefficients for different types of samples, type of sample was not expected to influence the size of an alpha coefficient.

Two research design characteristics have been studied extensively for their effects on the magnitudes of reliability coefficients: the number of categories, points, or intervals in a scale item, and the number of items in a scale. The effect of the number of categories on the size of a reliability coefficient has long been debated in the literature, with, for example, Bendig (1954) and Jacoby and Matell (1971) concluding that the magnitude of a reliability coefficient is independent of the number of scale categories and Komorita and Graham (1965) and Lissitz and Green (1975) concluding the opposite. With the exception of the Churchill and Peter (1984) research, which found a positive relationship between the number of scale categories and the size of a reliability coefficient, prior research on the number-of-categories issue either relied on relatively small samples or used a simulation approach. It was anticipated that the present research would resolve these conflicting findings through its synthesis of a large body of alpha coefficients.

The formula for coefficient alpha implies that the larger the number of items in a scale, the greater its reliability. This relationship is generally taken for granted in the literature, and Churchill and Peter's findings corroborated it. Even so, careful inspection of the literature on this subject reveals that the reliability of a scale is expected to increase with an increase in the number of items only under certain conditions that relate to the homogeneity of individual item variances. Contrary to popular belief, it is not clear that simply increasing the number of items in a scale will guarantee that its reliability will also increase. Consequently, the present research again addressed the issue from the perspective of a large body of alpha coefficients.

Analogous to the Churchill and Peter research, the present study investigated the effect of scale type, format, and nature on the magnitude of an alpha coefficient. Type of scale was operationalized by whether the scale consisted of traditional Likert items (i.e., declarative statements with a five-category "agree-disagree" response format) or semantic differential items (i.e., seven-category bipolar items). Scale format was operationalized as the specific labeling of scale-item categories (i.e., whether only endpoints were labeled, whether numerical or verbal labels were used on inner categories, or whether it was impossible to label categories). The nature of the scale was operationalized by whether there was an odd or even number of scale-item categories or whether it was impossible to tell how many categories there were. Given the findings obtained by Churchill and Peter, no differences in the magnitudes of alpha coefficients were expected as a function of scale format or scale nature. However, it was anticipated that scales consisting of semantic differential items would exhibit larger alpha coefficients than would scales consisting of Likert items if a relationship exists between the number of categories in a scale item and the magnitude of an alpha coefficient (simply because of the difference in the number of item categories).

The final characteristic studied that Churchill and Peter also investigated was mode of scale administration. Although Churchill and Peter did not offer a hypothesis regarding mode of scale administration and the magnitude of a reliability coefficient, in the present

TABLE 3
ALPHA COEFFICIENTS EXHIBITED BY SELECTED INDIVIDUAL DIFFERENCE CONSTRUCTS

Construct                      N    Mean α  Median α  95% confidence interval  First     Third
                                                      for mean α               quartile  quartile
Attitude                       699  .76     .79       ±.010                    .69       .86
Conflict/stress                378  .78     .81       ±.012                    .73       .87
Cognition/knowledge            74   .81     .84       ±.029                    .75       .91
Emotion (affect, mood, etc.)   234  .80     .84       ±.016                    .75       .89
Expectation                    37   .73     .81       ±.058                    .58       .87
Intention                      46   .81     .84       ±.043                    .73       .93
Involvement/commitment         94   .79     .80       ±.025                    .72       .87
Lifestyle/interest             65   .74     .77       ±.031                    .65       .84
Motivation                     86   .76     .78       ±.029                    .68       .87
Perceived risk                 50   .75     .75       ±.024                    .70       .83
Perception                     601  .77     .79       ±.010                    .70       .86
Performance (job-related)      89   .81     .83       ±.028                    .74       .90
Personality                    544  .75     .79       ±.012                    .69       .85
Preference                     57   .80     .81       ±.024                    .76       .86
Reported behavior              235  .71     .72       ±.017                    .63       .82
Satisfaction (job)             174  .82     .83       ±.013                    .77       .88
Satisfaction (other)           135  .79     .83       ±.018                    .75       .89
Self-confidence/self-esteem    102  .76     .79       ±.020                    .71       .82
Value/belief                   297  .70     .73       ±.017                    .63       .86
Miscellaneous^a                289  .76     .78       ±.016                    .69       .86

^a Includes such constructs as loyalty, innovativeness, and importance, each with fewer than 30 alpha coefficients.
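
To set a new scale against the rows of Table 3, a researcher first needs that scale's own alpha. The following is a minimal sketch of the variance-based formula for coefficient alpha given in the article, assuming complete data (no missing responses) and using the sample variance throughout. The function name and the small ratings matrix are hypothetical and serve only as illustration.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Coefficient alpha for a list of respondent rows, each holding k item scores.

    Implements alpha = k/(k-1) * (1 - sum of item variances / variance of total score).
    """
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha requires at least two items")
    if any(len(row) != k for row in responses):
        raise ValueError("every respondent must answer every item")

    item_columns = list(zip(*responses))              # one tuple of scores per item
    item_variances = [variance(col) for col in item_columns]
    total_scores = [sum(row) for row in responses]    # scale score per respondent
    total_variance = variance(total_scores)
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")

    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


if __name__ == "__main__":
    # Hypothetical data: five respondents, three items on a five-point scale.
    ratings = [
        [4, 5, 4],
        [2, 3, 2],
        [5, 5, 4],
        [3, 3, 3],
        [1, 2, 2],
    ]
    print(round(cronbach_alpha(ratings), 2))
```

Sample variances are used for both the items and the total score. Because the formula depends only on their ratio, using population variances consistently would give the same alpha.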
study it was expected that self-administered scales would exhibit larger alpha coefficients than would scales that were not self-administered because of the likelihood of there being less ambiguity and confusion associated with scale items that an individual could physically view and complete at his or her own pace. In their analysis, Churchill and Peter did not find a relationship between the administration mode and the magnitude of the reliability coefficients they analyzed.

Four research design characteristics not directly investigated by Churchill and Peter were also studied. One such characteristic was scale orientation, which indicated whether a scale was stimulus- or respondent-centered or both. Because respondent-centered scales have been the focus of more measurement attention than stimulus-centered scales (Cox 1980), it was anticipated that the former would exhibit larger alpha coefficients than the latter. A second research design characteristic studied was the nature of the construct represented by a scale, whether it was designated primarily as a dependent or an independent variable or as both or whether it was impossible to designate it at all. It was postulated that, because of the likelihood that a greater emphasis would be placed on a dependent variable during the conceptual and operational development of a study, larger alpha coefficients would be exhibited by dependent variables than by independent variables.

The third research design characteristic not directly investigated by Churchill and Peter was the type of research, whether a scale was developed specifically for the reported research, whether it was simply being applied, or whether this could not be determined. (Although Churchill and Peter did study a design characteristic they termed "source of scale," it is not directly comparable to the design characteristic investigated here.) If the scale was developed, a fourth research design characteristic was investigated. Specifically, an analysis of developed scales was conducted to determine whether there were scale items deleted during the development process and, if so, whether the number of items deleted from a scale influenced the size of its alpha coefficient.

It is unfortunate that, because of reporting practices, it was not always possible to capture the desired research design characteristics of every study reviewed. Hence, not all alpha coefficients harvested were included in every analysis.

RESULTS

Figure 1 illustrates the distribution of the 4,286 alpha coefficients harvested. The coefficients ranged from .06 to .99 with a mean of .77 and a median of .79. As can be observed from the figure, the coefficients were relatively tightly grouped (SE = .002), and there was a slight negative skew (sk = -1.15) to the distribution.

Seventy-five percent of the observed alpha coefficients were .70 or greater, 49 percent were .80 or greater, and 14 percent were .90 or greater. These three values correspond to Nunnally's 1978 recommendations (pp. 245-246) for minimally acceptable reliability levels for preliminary, basic, and applied research, respectively.

In general, the majority of reported alpha coefficients surpass the minimal standards recommended in Table 1.

FIGURE 1
PERCENTAGE DISTRIBUTION OF 4,286 ALPHA COEFFICIENTS
[Histogram. Horizontal axis: coefficient alpha values, 0.00 to 1.00. Vertical axis: percentage of alpha coefficients.]

As mentioned previously, Table 2 contains the number and mean value of alpha coefficients found in each source examined. The table reveals that the mean alpha coefficient observed in the psychology-related journals (mean α = .77) was not significantly different from that observed in the marketing-related journals and proceedings examined (mean α = .76). Further, because the mean alpha coefficients for the published studies (mean α = .77) and the unpublished studies (mean α = .76) were not significantly different, all alpha coefficients were included in the remaining analyses.

It is instructive to note that the mean observed alpha coefficient, .77, is very similar to the mean observed by Churchill and Peter, .75, in their review of reliability coefficients reported in a sample of marketing-related publications. Alpha coefficients reported prior to 1976 averaged .71; those reported after 1976 averaged .77. Although this difference is statistically significant (p < .001), the reason for it is not clear. The difference may be due to a variety of factors, including methodological improvements and/or reporting practices, or it simply may reflect changing standards as reflected by Nunnally's (1967, 1978) recommendations.

For each of the constructs investigated, the number of alpha coefficients observed, the mean and median alpha coefficients, the approximate 95 percent confidence interval for the mean (Feldt, Woodruff, and Salih 1987), and the first and third alpha coefficient quartiles are reported in Table 3. No systematic relationship between the type of construct measured and the magnitude of coefficient alpha was observed. Mean alpha coefficients ranged from .70 for values and beliefs to .82 for job satisfaction. Differences between alpha coefficients in excess of |.04| for pairs of constructs are generally statistically significant at .05. However, because of large differences in sample sizes, caution must be exercised when interpreting mean alpha coefficient differences.

Table 4 reports the number of and the mean and median values of alpha coefficients obtained for different levels of the research design characteristics investigated. In addition, approximate 95 percent confidence intervals are presented for each of the means, and the first and third quartiles of coefficient alpha are reported.

The table reveals that, with the exception of the type of research (scale development or scale application) and scale format, there were statistically significant differences (p < .001) between at least two levels of each of the remaining research design characteristics. In virtually every instance, though, significant differences were due to the large sample sizes employed rather than to large differences among or between alpha coefficients. There are few substantively significant or practically meaningful differences. For only three research design characteristics (the number of scale-item categories, the number of scale items, and the self- versus interviewer-administration mode) were differences of |.05| or greater observed in mean alpha coefficients.

Although statistically significant, the relationship between coefficient alpha and the number of scale-item categories was not especially strong. A regression analysis produced an r² of only .01, which indicated that only 1 percent of the variance in coefficient alpha was explained or accounted for by the number of categories in a scale item (the Churchill and Peter analysis produced an r² of .05). The major difference in mean alpha coefficients was between a scale item with two categories (mean α = .70) and scale items with more than two categories (mean α = .77).

Analogous to the relationship between the number of item categories and coefficient alpha, the relationship between the number of items and coefficient alpha was not especially strong. A regression analysis of the relationship resulted in an r² of .10, which was the same as that obtained by Churchill and Peter. The major difference in mean alpha coefficients was between scales with two or three items (mean α = .73) and those with more than three items (mean α = .78). Scales with 11 or more items exhibited the largest alpha coefficients, .81 on average.

A 2 X 2 ANOVA conducted on the number of item categories (two categories, three or more categories) and number of items (two or three items, four or more items) revealed no statistically significant interaction effect. The simple effects (shown in Fig. 2 as means), though, suggest that researchers should consider avoiding scales with less than four items when there are only two categories per item. (The limited number of alpha coefficients in the two-category and two or three item cell precludes extensive analyses and suggests that caution be used when making inferences.)

As expected, scales that were self-administered exhibited larger alpha coefficients (mean α = .77) than did those that were administered by an interviewer (mean α = .72). This finding, though, must be tempered by the disparate sample sizes and relatively small number of alpha coefficients for the interviewer mode of administration. Contrary to expectations, however, respondent-centered scales exhibited slightly smaller alpha coefficients (mean α = .76) than did stimulus-centered scales (mean α = .79), de-

TABLE 4
RELATIONSHIP BETWEEN COEFFICIENT ALPHA AND SELECTED RESEARCH DESIGN CHARACTERISTICS

                                                                                   Quartile
Construct                               N       Mean α   Median α   95% CI for α   First   Third
Sample size:ᵃ
  <100                                  1,028   .76      .80        ±.009          .68     .87
  100-199                               1,169   .78      .80        ±.007          .71     .87
  200-299                               696     .78      .80        ±.008          .72     .86
  300 or more                           1,265   .75      .77        ±.007          .68     .84
  Not given                             128     .76      .80        ±.003          .67     .86
Type of sample:ᵃ
  College students                      1,741   .77      .80        ±.007          .70     .87
  Consumers                             879     .74      .76        ±.009          .66     .84
  Businesspersons                       1,130   .77      .80        ±.007          .70     .86
  Mixed                                 450     .78      .79        ±.010          .72     .86
  Cannot tell                           86      .76      .77        ±.025          .69     .87
Number of scale categories:ᵃ
  Not given                             667     .76      .78        ±.009          .68     .85
  2                                     221     .70      .74        ±.022          .63     .82
  3                                     158     .78      .80        ±.020          .69     .88
  4                                     305     .76      .78        ±.014          .69     .86
  5                                     1,319   .77      .79        ±.007          .71     .86
  6                                     249     .75      .77        ±.016          .68     .84
  7                                     991     .78      .82        ±.009          .72     .88
  8 or more                             376     .77      .81        ±.015          .70     .88
Number of items:ᵃ
  Not given                             342     .77      .79        ±.015          .71     .86
  2                                     307     .73      .75        ±.016          .63     .83
  3                                     519     .73      .75        ±.012          .65     .84
  4                                     503     .76      .78        ±.011          .68     .85
  5                                     441     .78      .79        ±.011          .71     .86
  6                                     377     .75      .77        ±.013          .69     .84
  7                                     210     .76      .78        ±.016          .69     .85
  8                                     183     .73      .77        ±.025          .65     .85
  9                                     117     .80      .83        ±.017          .74     .89
  10                                    311     .74      .78        ±.015          .65     .85
  11 or more                            976     .81      .83        ±.007          .76     .89
Scale type:ᵃ
  Likert                                828     .76      .79        ±.009          .70     .86
  Semantic differential                 372     .80      .82        ±.013          .74     .89
Scale format:
  Only endpoints labeled                811     .77      .80        ±.010          .69     .87
  Numerical values on inner categories  553     .77      .80        ±.011          .69     .86
  Verbal values on inner categories     1,869   .77      .80        ±.006          .71     .86
  Cannot tell                           1,053   .76      .78        ±.008          .68     .85
Nature of scale:ᵃ
  Odd number of item categories         2,756   .78      .80        ±.005          .70     .86
  Even number of item categories        863     .74      .76        ±.010          .70     .86
  Cannot tell                           667     .76      .78        ±.009          .68     .85
Administration mode:ᵃ
  Self                                  4,064   .77      .79        ±.004          .70     .86
  Interviewer                           153     .72      .75        ±.028          .65     .85
  Not given                             69      .77      .78        ±.029          .70     .87
Scale orientation:ᵃ
  Respondent-centered                   3,017   .76      .79        ±.002          .69     .86
  Stimulus-centered                     1,206   .79      .81        ±.004          .73     .88
  Both                                  63      .79      .78        ±.018          .72     .87
Nature of construct:ᵃ
  Dependent                             919     .79      .82        ±.008          .73     .89
  Independent                           1,897   .77      .79        ±.006          .70     .86
  Cannot tell/both                      1,470   .75      .78        ±.007          .67     .85
Type of research:
  Scale development                     1,270   .77      .79        ±.007          .70     .86
  Scale application                     2,978   .77      .79        ±.005          .70     .86
  Cannot tell                           38      .84      .84        ±.029          .78     .91
ᵃ Relationship significant at p < .001.
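
The "Number of items" rows of Table 4 can be related to the average interitem correlation through the standardized form of coefficient alpha, α = k·r / (1 + (k − 1)·r), where k is the number of items and r is the average interitem correlation. The Python sketch below is illustrative only. It assumes that this standardized formula can be applied to mean alpha values, and its function names are hypothetical rather than taken from the article. Under those assumptions it reproduces the three-item and nine-item figures cited in the discussion (.47, .89, and .31).

```python
def standardized_alpha(k, mean_r):
    """Standardized alpha for k items with average interitem correlation mean_r."""
    return k * mean_r / (1 + (k - 1) * mean_r)


def implied_mean_r(k, alpha):
    """Invert the formula: average interitem correlation implied by alpha for k items."""
    return alpha / (k - (k - 1) * alpha)


# Mean alpha for three-item scales in Table 4 is .73.
r_three = implied_mean_r(3, 0.73)

# Alpha expected if that correlation held for a nine-item scale.
expected_alpha_nine = standardized_alpha(9, r_three)

# Mean alpha actually observed for nine-item scales in Table 4 is .80.
r_nine = implied_mean_r(9, 0.80)

print(round(r_three, 2), round(expected_alpha_nine, 2), round(r_nine, 2))
```

The gap between the expected and the observed nine-item alpha comes entirely from the lower implied interitem correlation, which is the point the discussion makes about item quality versus item quantity.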

spite the fact that respondent-centered scales averaged nearly two items more per scale than stimulus-centered scales.

Table 4 reveals that the average alpha coefficient did not vary as a function of whether a scale was being developed or applied (i.e., the scale had been developed previously). Table 5 indicates, though, that when a scale was developed, its alpha coefficient was significantly related to the number of items that were eliminated during the developmental process (r² = .18). Four hundred ninety-three scales that were reported in the reviewed literature as being developed had information regarding the number of items eliminated during the developmental process. It is apparent that eliminating items significantly increased the average alpha coefficient of a scale. (Scales with more than 30 items eliminated during the developmental process tended to be very long scales that were essentially used as item pools to develop new "short form" scale versions.) Eliminating even one item from a scale during the developmental process (usually on the basis of a statistical criterion as opposed to a theoretical one) significantly increased coefficient alpha from .70 to a minimum of .77. In general, eliminating items during scale development increased coefficient alpha to an average of .81.

FIGURE 2
RELATIONSHIP BETWEEN COEFFICIENT ALPHA AND NUMBER OF SCALE ITEMS AND ITEM CATEGORIES

                                 Number of scale items
Number of categories in item     2 or 3                4 or more
  2                              α = .62 (n = 23)      α = .71 (n = 186)
  3+                             α = .74 (n = 710)     α = .78 (n = 2,536)

TABLE 5
EFFECT OF ELIMINATING SCALE ITEMS DURING SCALE DEVELOPMENT ON COEFFICIENT ALPHA

Number of items eliminatedᵃ      N       Mean α
  None                           234     .70
  1-3                            64      .79
  4-10                           69      .80
  11-30                          63      .77
  More than 30                   63      .87
ᵃ Relationship significant at p < .001.

DISCUSSION AND CONCLUSIONS

The results of this study provide at least tentative answers to a variety of frequently asked questions regarding coefficient alpha. First, the results provide empirical evidence as to what is a "typical" alpha coefficient or, more precisely, what constitutes "high" or "low" alpha coefficients relative to previously obtained coefficients, both in general and for specific constructs and selected research design characteristics. Until this study (with the possible exception of the Churchill and Peter [1984] work), no empirical standards existed against which obtained alpha coefficients could be systematically compared. Researchers attempting to interpret an obtained alpha coefficient previously only had recommendations such as those offered in Table 1 or were forced to rely on experience or intuition. The present research provides empirical standards that permit direct comparisons. By comparing an observed alpha coefficient with coefficients reported in Tables 3 and 4 that were obtained under similar circumstances, "actuarial-type" insights are available regarding the magnitude of the observed coefficient.

Across the 4,286 alpha coefficients, 1,030 samples, and 832 studies investigated, the mean coefficient alpha was .77. Seventy-five percent of the observed alpha coefficients were .70 or greater. These values compare favorably with the recommendations set forth in Table 1 for preliminary or basic research. This agreement between the recommendations and reported alpha coefficients is neither surprising nor likely to be coincidental. Because the recommendations have effectively become sacrosanct, it can be persuasively argued that reported alpha coefficients (especially those that are associated with developed scales) are, on average, in large measure a function of the recommendations.

Only 14 percent of the observed alpha coefficients reached or exceeded .90, the threshold generally recommended for applied research by authorities such as Nunnally. However, the implications of this latter finding are not altogether clear. It may be that the recommended threshold levels for applied research are unrealistically high for consumer behavior and marketing research. Or, since most behavioral researchers characterize their research as basic (especially those comparing their alpha coefficients with the recommended standards), the absence of coefficients reaching the .90 standard may be of little consequence. Indeed, the relative absence of alpha coefficients at or above .90 may actually reflect good research practice. Boyle (1991), for instance, has argued that scales exhibiting very high alpha coefficients (e.g., above .90) should be avoided, because they simply imply a high level of item redundancy, not scale reliability.

An investigation of alpha coefficients .90 or greater revealed that, relative to alpha coefficients less than .90,

TABLE 6
SUMMARY COMPARISON OF CHURCHILL AND PETER (1984) AND PRESENT RESEARCH FINDINGS

Relationship with coefficient alpha
Research design characteristic | Churchill and Peter | Present research
Sample size | Sample size negatively related to alpha | No substantive relationship
Type of sample | No relationship | No substantive relationship
Number of scale-item categories | Number of item categories positively related to magnitude of alpha | Scale items with two categories exhibited smaller alphas than those with more than two categories
Number of items in scale | Positive relationship between number of items and size of alpha | Scales with two or three items exhibited smaller alphas than those with more than three items
Scale type | No relationship | No substantive relationship
Scale format | No relationship | No relationship
Scale nature | No relationship | No substantive relationship
Administration mode | No relationship | Interviewer administration produced lower alpha than did self-administration
Scale orientation | Not studied | No substantive relationship
Nature of construct | Not studied | No substantive relationship
Type of research | Not directly studied | No relationship
Number of items deleted during scale development | Not studied | Positive relationship between the number of items deleted and the magnitude of alpha

their originating scales were more likely to consist of more items with more categories, to be derived from smaller samples, to be stimulus-centered, and to have been developed to measure constructs employed as dependent variables. Succinctly stated, there appears to be a systematic relationship between the origin of an alpha coefficient and whether it is "sufficiently high" (in excess of .90) to warrant being deemed acceptable for applied research.

Second, the present research generally corroborated the findings of Churchill and Peter (1984; Peter and Churchill 1986) to the extent that it documented that coefficient alpha is relatively robust and is not subject to dramatic fluctuations as a consequence of research design characteristics. As summarized in Table 6, despite their underlying differences, the Churchill and Peter research and the present research obtained similar results for six of the eight design characteristics investigated in common. The two studies differed in their findings regarding the impact of sample size and the mode of administration on the magnitude of coefficient alpha, with Churchill and Peter finding a relationship for the former but not for the latter. Because Churchill and Peter had no a priori hypothesis regarding the relationship between sample size and coefficient alpha and were unable to explain the obtained relationship post hoc, it may have simply been an anomaly, as the present research suggests. The lack of a significant relationship between the mode of administration and the magnitude of coefficient alpha in the Churchill and Peter study may have been a consequence of their sample size and/or operationalization of the design characteristic. The finding in the present research that self-administered scales exhibited larger alpha coefficients than interviewer-administered scales is intuitively logical. Hence, the mode of administration warrants attention in future research as an influencer of coefficient alpha.

Furthermore, the present research confirms previous findings from simulation studies (see, e.g., Jenkins and Taber 1977; Lissitz and Green 1975) and empirical demonstrations (see, e.g., Bendig 1954; Jacoby and Matell 1971) that, with the exception of scale items possessing only two response categories, the number of categories in a scale item is essentially unrelated to the magnitude of coefficient alpha. Thus, the research adds to the body of knowledge relating to the "optimal" number of scale categories (see, e.g., Cox 1980) in that it shows that to increase reliability (as estimated by coefficient alpha) it is necessary to assume a strategy other than simply increasing the number of scale-item categories.

Perhaps the most interesting relationship studied was that between the number of items in a scale and the magnitude of coefficient alpha. Theoretically, the larger the number of items in a scale, the more reliable will be the scale (see, e.g., Nunnally 1978, p. 243). However,

as Table 3 indicates, the observed relationship deviated considerably from theory; only 10 percent of the variance in coefficient alpha could be attributed to the number of scale items. On average, coefficient alpha does not appear to systematically increase once there are more than three items in a scale. This could be due to the heterogeneity in coefficient alpha values within a particular number of scale items because of differences in constructs being measured, "noise" due to scale type and format differences, sampling errors, and so forth. To a large extent, though, the observed relationship probably reflects a decrease in the average interitem correlation as the number of items in a scale is increased.

As its formula indicates, ceteris paribus, coefficient alpha varies as a joint function of the number of items and the average interitem correlation. In particular, according to the formula, coefficient alpha should increase as the number of items and the average interitem correlation increase. It is interesting, though, that in the present study coefficient alpha and the average interitem correlation were inversely related. As shown in Figure 3, the average interitem correlation declined monotonically as the number of scale items increased, whereas coefficient alpha increased slightly as the number of scale items increased.

FIGURE 3
RELATIONSHIPS BETWEEN NUMBER OF SCALE ITEMS AND COEFFICIENT ALPHA AND AVERAGE INTERITEM CORRELATION
[Figure: coefficient alpha and average interitem correlation (vertical axis, 0.0 to 1.0) plotted against number of scale items (horizontal axis, 2 to 10).]

Consider the mean alpha coefficient observed for three-item scales, .73. The average interitem correlation corresponding to this alpha is .47. If this average interitem correlation were to be applied to a nine-item scale, the expected alpha coefficient would be .89. However, the observed alpha coefficient was .80. The difference between the expected and the observed alpha coefficient values is due to the size of the average interitem correlation observed for nine-item scales, .31. This result suggests that researchers attempting to increase the magnitude of an alpha coefficient should concentrate on the quality of items included in a scale and not simply on the quantity of items.

In conclusion, this article has documented the magnitudes of alpha coefficients obtained in behavioral research over the past three decades and has demonstrated that, with few exceptions, the magnitudes appear to be more of a function of the construct being measured than of the characteristics of the underlying research design. It is hoped that this article will provide useful information for those researchers constructing or evaluating multi-item scales. Moreover, by implication, it is hoped that the article will stimulate researchers to report more information on the scales used in their studies. Many of the articles and papers examined for this investigation did not contain sufficient information about a scale to permit an informed judgment as to its potential usefulness or application. For example, the format of 22 percent of the scales was not identified, and for 16 percent of the scales examined no mention was made of the number of item categories. Such statistics suggest the need for at least minimal reporting standards regarding research design characteristics when publishing scale-related research.

[Received September 1991. Revised October 1993. Kent B. Monroe served as editor for this article.]

REFERENCES

Bangert-Drowns, Robert L. (1986), "Review of Developments in Meta-analytic Method," Psychological Bulletin, 99 (May), 388-399.
Bendig, A. W. (1954), "Reliability and the Number of Rating Scale Categories," Journal of Applied Psychology, 38 (February), 38-40.
Boyle, Gregory J. (1991), "Does Item Homogeneity Indicate Internal Consistency or Item Redundancy in Psychometric Scales?" Personality and Individual Differences, 12 (March), 291-294.
Churchill, Gilbert A., Jr. (1979), "A Paradigm for Developing Better Measures of Marketing Constructs," Journal of Marketing Research, 16 (February), 64-73.
Churchill, Gilbert A., Jr. and J. Paul Peter (1984), "Research Design Effects on the Reliability of Rating Scales: A Meta-analysis," Journal of Marketing Research, 21 (November), 360-375.
Cortina, Jose M. (1993), "What Is Coefficient Alpha? An Examination of Theory and Applications," Journal of Applied Psychology, 78 (February), 98-104.
Cox, Eli P., III (1980), "The Optimal Number of Response Alternatives for a Scale: A Review," Journal of Marketing Research, 17 (November), 407-422.
Cronbach, Lee J. (1951), "Coefficient Alpha and the Internal Structure of Tests," Psychometrika, 16 (September), 297-334.
Davis, Frederick B. (1964), Educational Measurements and Their Interpretation, Belmont, CA: Wadsworth.
Feldt, Leonard S., David J. Woodruff, and Fathi A. Salih (1987), "Statistical Inference for Coefficient Alpha," Applied Psychological Measurement, 11 (March), 93-103.

Gerbing, David W. and James C. Anderson (1988), "An Updated Paradigm for Scale Development Incorporating Unidimensionality and Its Assessment," Journal of Marketing Research, 25 (May), 186-192.
Guilford, J. P. (1965), Fundamental Statistics in Psychology and Education, 4th ed., New York: McGraw-Hill.
Jacoby, Jacob and Michael S. Matell (1971), "Three-Point Likert Scales Are Good Enough," Journal of Marketing Research, 8 (November), 495-500.
Jenkins, G. Douglas, Jr. and Thomas D. Taber (1977), "A Monte Carlo Study of Factors Affecting Three Indices of Composite Scale Reliability," Journal of Applied Psychology, 62 (November), 392-398.
Kaplan, Robert W. and Dennis P. Saccuzzo (1982), Psychological Testing: Principles, Applications, and Issues, Monterey, CA: Brooks/Cole.
Komorita, Samuel S. and William K. Graham (1965), "Number of Scale Points and the Reliability of Scales," Educational and Psychological Measurement, 25 (November), 987-995.
Lissitz, Robert W. and Samuel B. Green (1975), "Effect of the Number of Scale Points on Reliability: A Monte Carlo Approach," Journal of Applied Psychology, 60 (February), 10-13.
Murphy, Kevin R. and Charles O. Davidshofer (1988), Psychological Testing: Principles and Applications, Englewood Cliffs, NJ: Prentice-Hall.
Nunnally, Jum C. (1967), Psychometric Theory, 1st ed., New York: McGraw-Hill.
Nunnally, Jum C. (1978), Psychometric Theory, 2d ed., New York: McGraw-Hill.
Perreault, William D., Jr. and Lawrence E. Leigh (1989), "Reliability of Nominal Data Based on Qualitative Judgments," Journal of Marketing Research, 26 (May), 135-148.
Peter, J. Paul (1979), "Reliability: A Review of Psychometric Basics and Recent Marketing Practices," Journal of Marketing Research, 16 (February), 6-17.
Peter, J. Paul and Gilbert A. Churchill, Jr. (1986), "Relationships among Research Design Choices and Psychometric Properties of Rating Scales: A Meta-analysis," Journal of Marketing Research, 23 (February), 1-10.
Peterson, Robert A., Gerald Albaum, and Richard F. Beltramini (1984), "A Meta-analysis of Effect Sizes in Consumer Behavior Experiments," Journal of Consumer Research, 12 (June), 97-103.
Symonds, Percival M. (1928), "Factors Influencing Test Reliability," Journal of Educational Psychology, 19 (February), 73-87.


© 1994 by JOURNAL OF CONSUMER RESEARCH, Inc. • Vol. 21 • September 1994
