Multiple Choice Test Item Analysis

Facilitator: Sophia Scott
Workshop Format
1. What is Multiple Choice Test Item Analysis?
2. Background information
3. Fundamentals
4. Guided Practice
5. Individual Practice
What is Multiple Choice Test Item Analysis?
Statistically analyzing your multiple
choice test items so that you can
ensure that your items are effectively
evaluating student learning.
Background information
What does a test score mean?
Reliability and Validity
Norm-referenced or Criterion-referenced
What does a Test Score Mean?

A test score reflects both what you really knew
(the true score) and error (things like atmosphere,
nerves, etc., that modify your true score).
The purpose of a systematic approach to test
design is to reduce error in test taking.
Reliability and Validity

Reliability: the test scores are consistent
Test-retest reliability (an individual's score
is consistent over time)
Inter-rater reliability (consistency of individual judges'
ratings of a performance)
Validity: the test measured what it was supposed
to measure.
You want your test to be both reliable and valid
Norm-referenced or Criterion-referenced
Norm-referenced: defines the performance of
test-takers in relation to one another. Uses the
frequency distribution and can rank students.
Often used to predict success, as with the GRE or GMAT.
Criterion-referenced: defines the performance of
each test-taker without regard to the performance
of others. Success means being able to perform a
specific task or set of competencies. Uses a
mastery curve.
Item analysis
How you interpret the results of a test and use
individual item statistics to improve the quality of a
test
Terms used
Standard deviation (SD): how far scores spread above and below the
average score; the more the scores are spread out,
the higher the SD
Mean: average score
N: number of items on the test
Raw scores: actual scores
Variance: standard deviation squared
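As a small illustration of these terms, the sketch below computes the mean, variance, and standard deviation of the ten raw scores listed in the Guided Practice table later in this workshop. It assumes the population form of the statistics (dividing by the number of scores); the slides do not say whether the population or the sample form is intended.

```python
import statistics

# Raw scores of the ten students in the Guided Practice table.
raw_scores = [8, 6, 6, 4, 2, 8, 10, 6, 8, 4]

mean = statistics.mean(raw_scores)

# Assumption: population statistics (divide by the number of scores).
variance = statistics.pvariance(raw_scores)
sd = statistics.pstdev(raw_scores)

print(f"mean = {mean:.2f}, variance = {variance:.2f}, SD = {sd:.2f}")
```

Squaring the printed SD gives back the variance, which is the relationship stated in the last term above.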
Fundamentals of Item Analysis

1. Were any of the items too difficult or easy?
2. Do the items discriminate between those
students who really knew the material from those
that did not?
3. What is the reliability of the exam?
1. Were any of the items too difficult or too easy?
Use the Difficulty Factor of a question
Proportion of respondents selecting the right answer
to that item
D=c/n
D = difficulty factor
c = number of correct answers
n = number of respondents
Range: 0 to 1
The HIGHER the difficulty factor, the easier the
question is, so a value of 1 would mean all the
students got the question correct and it may be too
easy
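The formula D = c / n can be sketched in a few lines. The function name `difficulty_factor` is hypothetical, and the example assumes that "a" is the keyed (correct) answer for Item 1, using the ten Item 1 responses from the Guided Practice table.

```python
def difficulty_factor(responses, key):
    """Return D = c / n, the proportion of respondents choosing the keyed answer."""
    if not responses:
        raise ValueError("need at least one response")
    c = sum(1 for r in responses if r == key)  # number of correct answers
    n = len(responses)                         # number of respondents
    return c / n


# Item 1 responses of the ten students in the Guided Practice table.
# Assumption: "a" is the keyed (correct) answer for Item 1.
item1 = ["a", "c", "a", "a", "c", "a", "a", "a", "a", "a"]
print(difficulty_factor(item1, key="a"))  # 0.8
```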
Difficulty Factor
Optimal Level is .5
To be able to discriminate between different levels
of achievement, the difficulty factor should be
between .3 and .7
If you want the students to master the topic area,
high difficulty values should be expected.
D=c/n
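The .3 to .7 guideline can be turned into a simple check. This is only a sketch: the function name `interpret_difficulty` and the label "may be too difficult" are assumptions, chosen to mirror the "may be too easy" wording used in this workshop.

```python
def interpret_difficulty(d, low=0.3, high=0.7):
    """Label a difficulty factor using the .3 to .7 guideline."""
    if not 0.0 <= d <= 1.0:
        raise ValueError("difficulty factor must be between 0 and 1")
    if d < low:
        return "may be too difficult"  # assumed label, mirrors "may be too easy"
    if d > high:
        return "may be too easy"
    return "good"


for d in (0.8, 0.6, 0.4):
    print(d, interpret_difficulty(d))
```

For a mastery test, where high difficulty values are expected, a "may be too easy" label is a prompt to review the item rather than a verdict.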
Guided Practice
What is the D for Items 1-3?
Table columns: Student, Raw score, Item 1, Item 2, Item 3, Item 4, Item 5
(only the Raw score and Item 1 entries are shown)
Raw score   Item 1
8           a
6           c
6           a
4           a
2           c
8           a
10          a
6           a
8           a
4           a
Difficulty Factor
Item # 1 = .8
Item # 2 = .6
Item # 3 = .4
What does it mean?
Item # 1 = .8 may be too easy
Item # 2 = .6 good
Item # 3 = .4 good
Individual Practice
What is the D for Items 4-5?
(Use the same table as in Guided Practice.)
Difficulty Factor
Item # 4 = .5
Item # 5 = .6
What does it mean?
Item # 4 = .5 optimal
Item # 5 = .6 good
Overall, you can say that only item #1 may be too
easy
2. Do the items discriminate between those students who really knew the material from those that did not?
The Discrimination Index

DI = (a - b) / n
a = response frequency of the High group
b = response frequency of the Low group
n = number of respondents
Point-Biserial Correlation
Correlates the test-takers' performance on a single
test item with their total score.
Range: +1.00 to -1.00
Items which discriminate well are those which
have difficulties between .3 and .7

Positive coefficient means that test-takers who got
the item right generally did well on the test as a
whole, while those who did poorly on the item did
poorly on the test.
Negative coefficient means that test-takers who
did well on the test missed the item, while those
who did poorly got the item right.
Zero coefficient means that the item does not
discriminate, for example when all test-takers got the
item correct or all got it incorrect.
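The point-biserial coefficient is the ordinary Pearson correlation between a 0/1 item score and the total score, so it can be sketched without any special library. The function name `point_biserial` is hypothetical. The example uses the raw scores and Item 1 responses from the Guided Practice table, assuming "a" is the keyed answer. Returning 0 for an item everyone got right (or wrong) is also an assumption, made to follow the slides; strictly the correlation is undefined in that case.

```python
import math


def point_biserial(item_correct, total_scores):
    """Pearson correlation between a 0/1 item score and the total test score."""
    n = len(item_correct)
    if n != len(total_scores) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 entries")
    mean_x = sum(item_correct) / n
    mean_y = sum(total_scores) / n
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(item_correct, total_scores))
    sxx = sum((x - mean_x) ** 2 for x in item_correct)
    syy = sum((y - mean_y) ** 2 for y in total_scores)
    if sxx == 0 or syy == 0:
        # Assumption: report 0 when the item (or total score) has no variation.
        return 0.0
    return sxy / math.sqrt(sxx * syy)


# Guided Practice data; assumption: "a" is the keyed answer for Item 1.
raw_scores = [8, 6, 6, 4, 2, 8, 10, 6, 8, 4]
item1 = ["a", "c", "a", "a", "c", "a", "a", "a", "a", "a"]
item1_correct = [1 if r == "a" else 0 for r in item1]

print(round(point_biserial(item1_correct, raw_scores), 2))  # 0.48
```

The result agrees with the .48 reported for Item 1 in the point-biserial results later in the workshop.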

The Discrimination Index Steps

1. Rank test scores from highest to lowest, so the
highest is at the top of the list
2. Define high group (top 27%)
3. Define low group (bottom 27%)
4. Calculate DI = (a - b) / n
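The four steps above can be sketched as follows. The function name `discrimination_index` is hypothetical. The slides define n as the number of respondents, while a common textbook convention divides by the number of test-takers in each group, which lets DI reach +1.00 and -1.00. Because the slides do not settle which is meant, the sketch takes n as an explicit argument. Rounding the 27% group size to a whole number and breaking ties by list order are further assumptions.

```python
def discrimination_index(item_correct, total_scores, n, fraction=0.27):
    """DI = (a - b) / n, with a and b counted in the top and bottom groups."""
    if len(item_correct) != len(total_scores) or not item_correct:
        raise ValueError("need two non-empty lists of equal length")
    # Step 1: rank test scores from highest to lowest.
    ranked = sorted(zip(total_scores, item_correct),
                    key=lambda pair: pair[0], reverse=True)
    # Steps 2 and 3: top and bottom 27% (assumption: rounded group size).
    size = max(1, round(fraction * len(ranked)))
    high_group = ranked[:size]
    low_group = ranked[-size:]
    # Step 4: a and b are the numbers of correct responses in each group.
    a = sum(correct for _, correct in high_group)
    b = sum(correct for _, correct in low_group)
    return (a - b) / n


# Guided Practice data; assumption: "a" is the keyed answer for Item 1.
raw_scores = [8, 6, 6, 4, 2, 8, 10, 6, 8, 4]
item1_correct = [1, 0, 1, 1, 0, 1, 1, 1, 1, 1]

di_per_group = discrimination_index(item1_correct, raw_scores, n=3)
di_all = discrimination_index(item1_correct, raw_scores, n=10)
print(round(di_per_group, 2), round(di_all, 2))
```

With ten students the 27% groups hold three students each, so the two calls show how much the choice of n changes the value for the same item.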
What does it mean?

Point Biserial
Item # 1 = .48
Item # 2 = .43
Item # 3 = .47
Item # 4 = .62
Item # 5 = .83
Item 5 is close to not discriminating
Overall the test does discriminate
3. What is the reliability of the exam?

1. Kuder-Richardson 20
2. Kuder-Richardson 21
3. Cronbach alpha
3. What is the reliability of the exam?

Range: 0 to 1
Higher value indicates a strong relationship
between items and test
Lower value indicates a weaker relationship
between test item and test
r = [n / (n - 1)] × [(s^2 - Σ p_i q_i) / s^2]
n = number of items on the test
s = standard deviation of the total test scores
p_i = proportion of correct responses to item i
q_i = 1 - p_i
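A minimal sketch of the standard Kuder-Richardson 20 computation, r = [n / (n - 1)] × [(s^2 - Σ p_i q_i) / s^2], is shown below. The function name `kr20` and the small score matrix are hypothetical illustration data, not the workshop's test results, and the population variance (dividing by the number of students) is assumed for s^2.

```python
def kr20(score_matrix):
    """KR-20 for rows of 0/1 item scores (one row per student)."""
    n_students = len(score_matrix)
    n_items = len(score_matrix[0])
    if n_students < 2 or n_items < 2:
        raise ValueError("need at least 2 students and 2 items")
    totals = [sum(row) for row in score_matrix]
    mean_total = sum(totals) / n_students
    # Assumption: population variance of the total scores.
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_students
    if var_total == 0:
        raise ValueError("total scores have no variance")
    pq_sum = 0.0
    for i in range(n_items):
        p = sum(row[i] for row in score_matrix) / n_students  # proportion correct
        pq_sum += p * (1 - p)
    return (n_items / (n_items - 1)) * (var_total - pq_sum) / var_total


# Hypothetical data: 5 students, 4 items (1 = correct, 0 = incorrect).
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(scores), 2))  # 0.8
```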
What does it mean?

Kuder 20
Item # 1 = .88
Item # 2 = .63
Item # 3 = .40
Item # 4 = .76
Item # 5 = .89
Item 3 may not relate as well
Overall the test is reliable
Review
Purpose - statistically analyze multiple choice
test items to ensure items are effectively
evaluating student learning.
1. Were any of the items too difficult or easy?
(Difficulty index)
2. Do the items discriminate between those
students who really knew the material from those
that did not? (Discrimination index or Point
Biserial)
3. What is the reliability of the exam? (Kuder 20)
More Practice
Item   Difficulty   Discrimination   Reliability
#1     .28          .40              .80
#2     .30          .68              .76
#3     .80          .78              .70
#4     .10          -1.00            .20
Thank you for your Time
Any Questions or Comments?
