Handbook of Statistics, Vol. 26


ISSN: 0169-7161
© 2006 Elsevier B.V. All rights reserved
DOI: 10.1016/S0169-7161(06)26004-8

Reliability Coefficients and Generalizability Theory

Noreen M. Webb, Richard J. Shavelson and Edward H. Haertel

1. Introduction

When a person is tested or observed multiple times, such as a student tested for mathematics achievement or a Navy machinist mate observed while operating engine room
equipment, scores reflecting his or her performance may or may not agree. Not only may
individuals' scores vary from one testing to another, calling into question the defensibility of using only one score for decision-making purposes, but the rankings of individuals
may also disagree. The concern of reliability studies is to estimate the consistency of
scores across repeated observations. Reliability coefficients quantify the consistency
among the multiple measurements on a scale from 0 to 1.
In this chapter we present reliability coefficients as developed in the framework of
classical test theory, and describe how the conception and estimation of reliability were
broadened in generalizability theory. Section 2 briefly sketches foundations of classical
test theory (see the chapter by Lewis for a thorough development of the theory) and focuses on traditional methods of estimating reliability. Section 3 reviews generalizability
theory, including applications and recent theoretical contributions.

2. Reliability Coefficients in Classical Test Theory

Classical test theory's reliability coefficients are widely used in behavioral and social
research. Each provides an index of measurement consistency ranging from 0 to 1.00,
and their interpretation, at first blush, is relatively straightforward: the proportion of
observed-score variance attributable to true scores (stable or nonrandom individual differences) (see the chapter by Lewis for definitions in classical test theory). Coefficients at or
above 0.80 are often considered sufficiently reliable to make decisions about individuals based on their observed scores, although a higher value, perhaps 0.90, is preferred
if the decisions have significant consequences. Of course, reliability is never the sole
consideration in decisions about the appropriateness of test uses or interpretations.
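
The variance-ratio interpretation above can be illustrated with a small simulation. The following Python sketch is not taken from the chapter: it assumes the classical decomposition of an observed score into a true score plus independent random error, and the function name, the normal distributions, and the parameter values are hypothetical choices made only for illustration.

```python
import random
import statistics


def simulate_reliability(n_persons=5000, true_sd=2.0, error_sd=1.0, seed=0):
    """Return (theoretical, empirical) reliability for simulated scores.

    Assumed model (illustrative only): observed = true + error, with
    true scores and errors drawn independently from normal distributions.
    """
    rng = random.Random(seed)

    # Stable individual differences: one true score per person.
    true_scores = [rng.gauss(50.0, true_sd) for _ in range(n_persons)]

    # One observed score per person: true score plus random error.
    observed = [t + rng.gauss(0.0, error_sd) for t in true_scores]

    # Reliability as the proportion of observed-score variance
    # attributable to true scores.
    theoretical = true_sd ** 2 / (true_sd ** 2 + error_sd ** 2)
    empirical = statistics.pvariance(true_scores) / statistics.pvariance(observed)
    return theoretical, empirical


if __name__ == "__main__":
    theoretical, empirical = simulate_reliability()
    print(f"theoretical reliability: {theoretical:.3f}")
    print(f"empirical reliability:   {empirical:.3f}")
```

With these assumed standard deviations the theoretical ratio is 4 / (4 + 1) = 0.80, the level the text describes as often considered sufficient for decisions about individuals. In real data the true scores are unobservable, so the ratio cannot be computed directly and must be estimated.
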
Coefficient alpha (also known as Cronbach's alpha) is perhaps the most widely
used reliability coefficient. It estimates test-score reliability from a single test administration using information from the relationship among test items. That is, it provides an