COMPARING NORM-REFERENCED AND
CRITERION-REFERENCED TESTS
As you may have guessed, criterion-referenced tests must be very
specific if they are to yield information about individual skills. This is
both an advantage and a disadvantage. Using a very specific test enables
you to be relatively certain that your students have mastered or failed to
master the skill in question. The major disadvantage of criterion-referenced
tests is that many such tests would be necessary to make
decisions about the multitude of skills typically taught in the average
classroom.
The norm-referenced test, in contrast, tends to be general. It
measures a variety of specific and general skills at once, but it fails to
measure them thoroughly. Thus you are not as sure as you would be with a
criterion-referenced test that your students have mastered the individual skills
in question. However, you get an estimate of ability in a variety of skills in a
much shorter time than you could through a battery of criterion-referenced
tests. Since there is a trade-off in the uses of criterion- and
norm-referenced measures, there are situations in which each is
appropriate. Determining the appropriateness of a given type of test
depends on the purpose of testing.
Finally, the difficulty of items in NRTs and CRTs also differs. In the
NRT, items vary in level of difficulty from those that almost no one
answers correctly to those that almost everyone answers correctly. In the
CRT, the items tend to be equivalent to each other in difficulty. Following a
period of instruction, students tend to find CRT items easy and answer most
correctly. In a CRT, about 80% of the students completing a unit of
instruction are expected to answer each item correctly, whereas in an NRT
about 50% are expected to do so. Table 4.1 illustrates differences between
NRTs and CRTs.
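As a rough illustration of the item-difficulty figures above, the proportion of students who answer each item correctly can be computed directly from scored responses. The sketch below is written in Python and uses a small made-up response matrix; the data and the function name `item_difficulty` are assumptions for illustration and do not come from the text.

```python
# Hypothetical scored responses: each row is one student, each column one item.
# 1 = answered correctly, 0 = answered incorrectly.
responses = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [0, 1, 0, 1],
    [1, 1, 0, 1],
]


def item_difficulty(scored):
    """Return, for each item, the proportion of students who answered it correctly."""
    n_students = len(scored)
    n_items = len(scored[0])
    return [sum(row[i] for row in scored) / n_students for i in range(n_items)]


p_values = item_difficulty(responses)

for number, p in enumerate(p_values, start=1):
    print(f"Item {number}: {p:.0%} of students answered correctly")
```

In this invented data set the first two items are each answered correctly by 80% of the students, which is close to what the text describes for a CRT after instruction. The third item (20%) and the fourth (100%) sit at the extremes that an NRT may span across its items.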
TABLE 4.1 Comparing NRTs and CRTs

| Dimension | NRT | CRT |
| --- | --- | --- |
| Average number of students who get an item right | 50% | 80% |
| Compares a student's performance to | The performance of other students. | Standards indicative of mastery. |
| Breadth of content sampled | Broad, covers many objectives. | Narrow, covers a few objectives. |
| Comprehensiveness of content sampled | Shallow, usually one or two items per objective. | Comprehensive, usually three or more items per objective. |
| Variability | Since the meaningfulness of a norm-referenced score basically depends on the relative position of the score in comparison with other scores, the more variability or spread of scores, the better. | The meaning of the score does not depend on comparison with other scores: It flows directly from the connection between the items and the criterion. Variability may be minimal. |
| Item construction | Items are chosen to promote variance or spread. Items that are "too easy" or "too hard" are avoided. One aim is to produce good "distractor options." | Items are chosen to reflect the criterion behavior. Emphasis is placed upon identifying the domain of relevant responses. |
| Reporting and interpreting considerations | Percentile rank and standard scores used (relative rankings).* | Number succeeding or failing or range of acceptable performance used (e.g., 90% proficiency achieved, or 80% of class reached 90% proficiency). |

*For a further discussion of percentile rank and standard scores, see Chapters 45 and 46.

NORM- AND CRITERION-REFERENCED TESTS AND
CONTENT VALIDITY EVIDENCE
