Analysis
Workshop Format
1. What is Multiple Choice Test Item Analysis?
2. Background information
3. Fundamentals
4. Guided Practice
5. Individual Practice
Background information
What does a test score mean?
Reliability and Validity
Norm-referenced or Criterion-referenced
Norm-referenced testing defines the performance of test takers in relation to one another: it uses the frequency distribution and can rank students. It is often used to predict success, as with the GRE or GMAT.
Criterion-referenced testing defines the performance of each test taker without regard to the performance of others: success means being able to perform a specific task or set of competencies. It uses a mastery curve.
Item analysis
Item analysis is how you interpret the results of a test and use individual item statistics to improve the quality of the test.
Terms used
Standard deviation (SD): the range of scores above and below the average score; the more spread out the scores, the higher the SD.
Mean: the average score.
N: the number of items on the test.
Raw scores: the actual scores.
Variance: the standard deviation squared.
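These terms can be illustrated with Python's standard `statistics` module, here using the raw scores from the Guided Practice table later in this workshop:

```python
import statistics

# Raw scores from the Guided Practice table (10 students)
raw_scores = [8, 6, 6, 4, 2, 8, 10, 6, 8, 4]

mean = statistics.mean(raw_scores)           # average score
sd = statistics.pstdev(raw_scores)           # population standard deviation
variance = statistics.pvariance(raw_scores)  # SD squared

print(mean)      # 6.2
print(variance)  # 5.16
```

Note that the variance equals the standard deviation squared, as defined above: `sd ** 2` reproduces `variance` up to floating-point rounding.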
Difficulty Factor
D = c / n, where c is the number of students who answered the item correctly and n is the number of students who took the test.
Range: 0 to 1.
The HIGHER the difficulty factor, the easier the question: a value of 1 would mean all the students got the question correct, and the item may be too easy.
The optimal level is .5.
To be able to discriminate between different levels of achievement, the difficulty factor should be between .3 and .7.
If you want the students to master the topic area, high difficulty values should be expected.
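The formula is a one-liner; a minimal sketch (the function name is my own):

```python
def difficulty_factor(c, n):
    """Difficulty factor D = c / n.

    c: number of students who answered the item correctly
    n: number of students who took the test
    """
    return c / n

# A value near 1 means almost everyone answered correctly (an easy item);
# a value near 0 means almost no one did (a hard item).
print(difficulty_factor(8, 10))  # 0.8 -- above the .3-.7 band, may be too easy
print(difficulty_factor(5, 10))  # 0.5 -- the optimal level
```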
Guided Practice
What is the D for Items 1-3?

Student   Raw score   Item 1
1         8           a
2         6           c
3         6           a
4         4           a
5         2           c
6         8           a
7         10          a
8         6           a
9         8           a
10        4           a

(Responses shown for Item 1 only.)
Difficulty Factor
Item #1 = .8
Item #2 = .6
Item #3 = .4

What does it mean?
Item #1 = .8: may be too easy
Item #2 = .6: good
Item #3 = .4: good
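Item #1 can be checked directly from the response column in the table, assuming "a" is the keyed (correct) answer for that item:

```python
# Item 1 responses from the Guided Practice table, one per student
responses = ["a", "c", "a", "a", "c", "a", "a", "a", "a", "a"]
key = "a"  # assumed correct answer for Item 1

c = sum(1 for r in responses if r == key)  # students who answered correctly
n = len(responses)                         # students who took the test
print(c / n)  # 0.8, matching Item #1 above
```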
Individual Practice
What is the D for Items 4-5?

Student   Raw score   Item 1
1         8           a
2         6           c
3         6           a
4         4           a
5         2           c
6         8           a
7         10          a
8         6           a
9         8           a
10        4           a

(Responses shown for Item 1 only.)
Difficulty Factor
Item #4 = .5
Item #5 = .6

What does it mean?
Item #4 = .5: optimal
Item #5 = .6: good

Overall, you can say that only Item #1 may be too easy.
Review
Purpose: statistically analyze multiple choice test items to ensure the items are effectively evaluating student learning.
1. Were any of the items too difficult or easy?
(Difficulty index)
2. Do the items discriminate between the students who really knew the material and those who did not? (Discrimination index or point-biserial)
3. What is the reliability of the exam? (Kuder-Richardson 20)
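These three statistics can be sketched on a hypothetical 0/1 score matrix (rows are students, columns are items; the data and function names below are illustrative, and the discrimination shown is the upper/lower-group index rather than the point-biserial correlation):

```python
import statistics

# Hypothetical 0/1 scores: 1 = correct, 0 = incorrect
scores = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 1],
    [0, 1, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
totals = [sum(row) for row in scores]  # each student's raw score
k = len(scores[0])                     # number of items

def difficulty(item):
    """Difficulty index: fraction of students answering the item correctly."""
    col = [row[item] for row in scores]
    return sum(col) / len(col)

def discrimination(item, group=3):
    """Upper/lower-group discrimination index:
    p(item) in the top `group` students minus p(item) in the bottom `group`."""
    order = sorted(range(len(scores)), key=lambda s: totals[s])
    lower, upper = order[:group], order[-group:]
    p = lambda g: sum(scores[s][item] for s in g) / len(g)
    return p(upper) - p(lower)

def kr20():
    """Kuder-Richardson 20: reliability estimate for the whole exam."""
    pq = sum(difficulty(i) * (1 - difficulty(i)) for i in range(k))
    return (k / (k - 1)) * (1 - pq / statistics.pvariance(totals))

print(round(difficulty(0), 2))       # 0.67
print(round(discrimination(0), 2))   # 0.67
print(round(kr20(), 2))              # 0.66
```

A positive discrimination index means the high scorers did better on the item than the low scorers, which is what a well-functioning item should show.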
More Practice
Item   Difficulty   Discrimination   Reliability
#1     .28          .40              .80
#2     .30          .68              .76
#3     .80          .78              .70
#4     .10          -1.00            .20
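The items in this table can be screened against the guidelines from earlier in the workshop: a difficulty factor outside the .3-.7 band limits an item's ability to discriminate, and a negative discrimination index means low scorers outperformed high scorers. A sketch (the flag wording is my own):

```python
# (difficulty, discrimination, reliability) for each item in the table above
items = {
    "#1": (0.28, 0.40, 0.80),
    "#2": (0.30, 0.68, 0.76),
    "#3": (0.80, 0.78, 0.70),
    "#4": (0.10, -1.00, 0.20),
}

report = {}
for name, (difficulty, discrimination, reliability) in items.items():
    flags = []
    if difficulty < 0.3:
        flags.append("may be too hard")
    elif difficulty > 0.7:
        flags.append("may be too easy")
    if discrimination < 0:
        flags.append("negative discrimination: review the item and its key")
    report[name] = flags

for name, flags in report.items():
    print(name, "; ".join(flags) or "looks fine")
```

Item #4 stands out: it is both very hard and negatively discriminating, and its low reliability (.20) reinforces that the item should be reviewed or rewritten.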