• Item analysis gives you a way to exercise additional quality control over your
test. Well-specified learning objectives and well-constructed items give you a
head start on the process, but item analyses can give you feedback on how
successful you were.
• Items can be analyzed:
a. Qualitatively – in terms of their content, form, and validity. This also includes
the effectiveness of the item-writing procedures.
b. Quantitatively – in terms of their statistical properties. This also includes
measures of item difficulty and item discrimination.
• Tests can be improved through the selection, substitution, and revision of test
items.
• Item Difficulty – the percentage (or proportion) of persons who answered
the test item correctly. It is the relative frequency with which examinees
choose the correct response. It is commonly known as the p value, which
ranges from 0.0 to 1.0.
• Example: A word that is correctly defined by 70% of the standardization
sample (p = .70) is regarded as easier than one that is correctly defined
by only 15% (p = .15).
• One of the basic rules in item difficulty is to arrange items in order of
difficulty, so that test takers begin with relatively easy items and
proceed to items of increasing difficulty.
• This arrangement gives test takers confidence in approaching the test.
• It also avoids wasting test takers' time on items beyond their ability.
• A major reason for measuring item difficulty is to choose items of suitable
difficulty level.
• If no one passes an item, it is considered excess baggage: it provides no
information about individual differences. The same is true if everyone
passes. Since such excess-baggage items do not affect the variability of test
scores, they contribute nothing to the reliability and validity of the test
scores.
• The formula for item difficulty is:

  Difficulty = (# of test takers who answered the item correctly / Total # tested) x 100
• Example: The first item was administered to 25 students, and let us assume that
23 students answered the item correctly.
• Difficulty = 23 / 25 x 100 = 92%
• The second item was administered to the same group, and 14 students answered
the item correctly.
• Difficulty = 14 / 25 x 100 = 56%
• This means that item #1 is easier than item #2, since a higher percentage of
students got the correct answer on item #1 than on item #2.
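The calculation above can be sketched as a small helper function (the function name is my own; the figures are the two example items from the text):

```python
def item_difficulty(num_correct, num_tested):
    """Percentage of test takers who answered an item correctly."""
    # Multiply before dividing so the example values come out exact.
    return num_correct * 100 / num_tested

# Item 1: 23 of 25 students correct
print(item_difficulty(23, 25))  # -> 92.0
# Item 2: 14 of 25 students correct
print(item_difficulty(14, 25))  # -> 56.0
```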
• Criterion-referenced tests (post-testing) – with their emphasis on mastery
testing, many items on an exam form will have p-values of .9 or above.
• Norm-referenced tests (pre-testing) – are designed to be harder overall
and to spread out the examinees' scores. Thus, many of the items on an NRT
will have difficulty indexes between .4 and .6.
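As a hypothetical illustration of screening items against the NRT range mentioned above (the item names and p-values are invented):

```python
# p-values for a pool of candidate items (illustrative figures only)
p_values = {"item1": 0.92, "item2": 0.56, "item3": 0.45, "item4": 0.20}

# For a norm-referenced test, keep items in the moderate .4-.6 range
nrt_candidates = [name for name, p in p_values.items() if 0.4 <= p <= 0.6]
print(nrt_candidates)  # -> ['item2', 'item3']
```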
• Distribution of Test Scores
• The difficulty of the test as a whole is, of course, directly dependent on
the difficulty of the items that make up the test.
• A distribution of test scores that is clearly skewed:
• Ex. A. Piling at the lower end of the scale.
• This means the test lacks easy items and most are difficult to answer.
• Test takers would typically obtain zero or near-zero scores on the test
items.
• B. Piling at the upper end of the scale.
• Most test takers obtain nearly perfect scores.
• It is impossible to measure individual differences.
• Most items are easy.
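A minimal sketch of detecting such piling, assuming raw total scores are available (the function, the 10% tail threshold, and the sample scores are invented for illustration):

```python
def piling(scores, max_score, tail=0.1):
    """Fraction of examinees scoring in the bottom and top tails of the range."""
    low = sum(1 for s in scores if s <= max_score * tail)
    high = sum(1 for s in scores if s >= max_score * (1 - tail))
    n = len(scores)
    return low / n, high / n

# Mostly near-zero scores on a 10-point test: piling at the lower end,
# suggesting the test lacks easy items.
hard_test_scores = [0, 1, 0, 2, 1, 0, 9, 1, 0, 1]
low, high = piling(hard_test_scores, max_score=10)
print(low, high)  # -> 0.8 0.1
```

A large value at one tail flags the skewed distributions described above: piling at the lower end means the test is too hard, piling at the upper end means it is too easy to separate individuals.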
• Interpreting reliability coefficients:
  .90 and above – Excellent reliability; at the level of the best standardized tests.
  .70 – .80 – Good for a classroom test; in the range of most. There are
  probably a few items which could be improved.
  .50 – .60 – Suggests need for revision of the test, unless it is quite short (ten or
  fewer items). The test definitely needs to be supplemented by
  other measures (e.g., more tests) for grading.