
24/07/18

EDU3: Test & Measurement

ADMINISTERING, ANALYZING,
AND IMPROVING THE TEST OR
ASSESSMENT

Prepared by:
Jan Nicole Juat, RPm, MSc
msnjuat@gmail.com


OUTLINE
•  Assembling the Test
•  Administering the Test
•  Scoring the Test
•  Analyzing the Test
•  Debriefing
•  The Process of Evaluating
Classroom Achievement


Assembling the Test


q Package the Test
q Reproduce the Test

At this point you should have:
ü Written measurable objectives
ü Prepared a test blueprint
ü Written test items that match your instructional objectives



PACKAGING THE TEST
§  Group Together All Items of Similar Format
§  Have your true-false items grouped together
§  All completion items together
§  This enables students to cover more items in a given time
§  TIME-SAVER
§  Arrange Test Items from Easy to Hard
§  Builds confidence
§  Reduces test anxiety
§  Space the Items for Easy Reading
§  Provide enough blank space between items so that each item is distinctly separate from the others
§  If items are too close together, students might perceive a word, phrase, or line from a preceding or following item as a part of the item in question
§  This interferes with a student’s capacity to demonstrate his or her true ability



PACKAGING THE TEST
§  Keep Items and Options on the Same Page
§  It is aggravating for a test-taker to have to turn the page to read the options
§  This also minimizes the likelihood that the last line or two of an item will be cut off when you reproduce the test
§  Position Illustrations Near Descriptions
§  Place diagrams, maps, or other supporting material immediately above the item or items to which they refer
§  Check Your Answer Key
§  There should be no discernible pattern of correct answers



PACKAGING THE TEST
§  Provide Space for Name and Date
§  Check Test Directions
§  Directions should specify:
§  The number of items to which they apply
§  How to record answers
§  The basis on which to select answers
§  Criteria for scoring
§  Proofread the Test
§  Check for typographical and grammatical errors before reproducing the test and make any necessary corrections
§  Reproduce the Test



ADMINISTERING THE TEST
o  Maintain a Positive Attitude: Keep in mind the main purposes of classroom testing, which are to evaluate achievement and to provide feedback to yourself and your students
o  Maximize Achievement Motivation: Keep your general statements about the test accurate (avoid saying it will be easy or hard), and reassure and motivate students
o  Equalize Advantages: Try to equalize the advantages testwise students have over non-testwise students



ADMINISTERING THE TEST
o  Avoid Surprises: Be sure your students have sufficient advance notice of a test. Avoid pop quizzes for elementary and high school students
o  Clarify the Rules: Inform students about time limits, restroom policy, and any special considerations about the answer sheet before distributing the tests
o  Rotate Distribution: Alternate beginning test distribution at the left, right, front, and back of the class. This way, the same person will not always be the last one to receive the test



ADMINISTERING THE TEST
o  Remind Students to Check Their Copies: Have students verify that no pages have been omitted, and remind them to write their names
o  Monitor Students: Inform students about penalties for cheating, and implement the penalties when cheating occurs
o  Minimize Distractions: Keep noise and other distractions to a minimum
o  Give Time Warnings: Let students know when the time limit is approaching
o  Collect Tests Uniformly: Indicate whether you want all papers or only some returned, and where they are to be placed



SCORING THE TEST
•  Prepare an Answer Key:
•  When constructing the answer key, you should get an
idea of how long it will take your students to complete
the test and whether this time is appropriate for the
time slot you have allocated to the test
•  Check the Answer Key:
•  have a colleague check your answer key to identify
alternative answers or potential problems
•  Score Blindly:
•  Keep the student’s name out of sight to prevent your knowledge about, or expectations of, the student from influencing the score
•  Check Machine-Scored Answer Sheets
•  Check Scoring



ANALYZING THE TEST
§  Errors in test construction
§  no test you construct will be perfect - it will
include inappropriate, invalid, or otherwise
deficient items
§  “Item Analysis”
§  Quantitative item analysis
§  Qualitative item analysis

2 TYPES OF ITEM ANALYSIS

QUANTITATIVE ANALYSIS
•  A numerical method for analyzing test items, employing student responses to alternatives or options
•  Identifies distractors or response options that are not doing what they are supposed to be doing
•  Ideally suited for examining the usefulness of multiple-choice formats
•  For norm-referenced tests

QUALITATIVE ANALYSIS
•  A non-numerical method for analyzing test items, not employing student responses, but considering test objectives, content validity, and technical item quality



OTHER TERMINOLOGIES
•  KEY
•  Correct option in a multiple-choice item.
•  DISTRACTOR
•  Incorrect option in a multiple-choice item.
•  DIFFICULTY INDEX (p)
•  Proportion of students who answered the item
correctly.
•  DISCRIMINATION INDEX (D)
•  Measure of the extent to which a test item
discriminates or differentiates between
students who do well on the overall test and
those who do not do well on the overall test.
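As an illustrative sketch (not from the slides), both indices can be computed in a few lines of Python; the function names and the counts used below are hypothetical:

```python
def difficulty_index(item_scores):
    """p: proportion of students who answered the item correctly (1/0 scoring)."""
    return sum(item_scores) / len(item_scores)

def discrimination_index(upper_correct, lower_correct, group_size):
    """D: difference between the upper- and lower-group proportions correct."""
    return (upper_correct - lower_correct) / group_size

# Hypothetical class: 15 of 25 students answer the item correctly;
# 9 of 12 upper-group and 5 of 12 lower-group students choose the key.
p = difficulty_index([1] * 15 + [0] * 10)
D = discrimination_index(9, 5, 12)
print(p, round(D, 3))  # 0.6 0.333
```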

3 TYPES OF DISCRIMINATION INDEXES (D)

•  POSITIVE DISCRIMINATION INDEX
•  Those who did well on the overall test chose the correct answer for a particular item more often than those who did poorly on the overall test.

•  NEGATIVE DISCRIMINATION INDEX
•  Those who did poorly on the overall test chose the correct answer for a particular item more often than those who did well on the overall test.

•  ZERO DISCRIMINATION INDEX
•  Those who did well and those who did poorly on the overall test chose the correct answer for a particular item with equal frequency.


ANALYZING MULTIPLE CHOICE ITEMS

•  STEP 1: Compute p, the item’s difficulty index
•  Were the students who answered it correctly those who did well on the overall test?
•  Did the distractors fool those who did well or poorly on the test?

ANALYZING MULTIPLE CHOICE ITEMS

If more students who do well on the test overall answer an item correctly—positive discrimination index—that item helps the overall discrimination ability of the test. If this is true for all items (they are all positively discriminating), the test will do a good job of discriminating between those who know their stuff and those who don’t. To the extent that students who do poorly on the overall test answer individual items correctly—negative discrimination index—the test loses its ability to discriminate.


ANALYZING MULTIPLE CHOICE ITEMS

•  STEP 2: Compute D, the discrimination index
1.  Arrange the papers from highest to lowest score.
2.  Separate the papers into an upper group and a lower group based on total test scores. Do so by including half of your papers in each group.
3.  For each item, count the number in the upper group and the number in the lower group that chose each alternative.
4.  Record your information for each item in the following form (the following data are from the previous example; again the asterisk indicates the keyed option):
5.  Compute D, the discrimination index, by plugging the appropriate numbers into the following formula:

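The five steps above can be sketched in Python. This is an illustrative implementation, not the slides' own; it assumes the standard formula D = (U - L) / n, where U and L are the numbers answering correctly in the upper and lower groups and n is the size of either group:

```python
def compute_discrimination(papers, key):
    """papers: list of (total_test_score, option_chosen) per student.
    Implements steps 1-5; assumes D = (U - L) / n with n = size of either group."""
    # Steps 1-2: rank papers from highest to lowest total score, split into halves.
    ranked = sorted(papers, key=lambda paper: paper[0], reverse=True)
    n = len(ranked) // 2
    upper, lower = ranked[:n], ranked[-n:]
    # Step 3: count how many in each group chose the keyed option.
    upper_correct = sum(1 for _, choice in upper if choice == key)
    lower_correct = sum(1 for _, choice in lower if choice == key)
    # Steps 4-5: plug the counts into the formula.
    return (upper_correct - lower_correct) / n

# Hypothetical papers: (total score, option chosen); the key is "B".
papers = [(95, "B"), (88, "B"), (80, "C"), (70, "B"), (60, "A"), (50, "D")]
print(round(compute_discrimination(papers, "B"), 3))  # 0.333
```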


ANALYZING MULTIPLE CHOICE ITEMS

Difficulty index (p) = 0.60
Discrimination index (D) = 0.267

Test construction experts try to build tests that have most items between p levels of 0.20 and 0.80, with an average p level of about 0.50.

When p levels are less than about 0.25, the item is considered relatively difficult. When p levels are above 0.75, the item is considered relatively easy.

Some experts insist that D should be at least 0.30, while others believe that as long as D has a positive value, the item’s discrimination ability is adequate.
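These rules of thumb can be wrapped in a small helper. The sketch below simply encodes the thresholds quoted above; the labels and the default cutoff parameter are illustrative:

```python
def classify_item(p, D, min_D=0.30):
    """Label an item using the rule-of-thumb thresholds quoted above."""
    if p < 0.25:
        difficulty = "relatively difficult"
    elif p > 0.75:
        difficulty = "relatively easy"
    else:
        difficulty = "acceptable difficulty"
    if D >= min_D:
        discrimination = "adequate discrimination"
    elif D > 0:
        discrimination = "positive but weak discrimination"
    else:
        discrimination = "review: zero or negative discrimination"
    return difficulty, discrimination

# The worked example's item (p = 0.60, D = 0.267):
print(classify_item(0.60, 0.267))  # ('acceptable difficulty', 'positive but weak discrimination')
```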


ANALYZING MULTIPLE CHOICE ITEMS

•  STEP 3: Should this item be eliminated?
•  STEP 4: Should any distractor be eliminated or modified?
•  STEP 5: Check for miskeying, guessing, or ambiguity, examining the choices of the upper group only

•  Miskeying
•  More upper-group students choose a distractor than the key
•  Guessing
•  Upper-group choices are spread evenly across all options
•  Ambiguity
•  Equal numbers of upper-group students choose one distractor and the key
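The three patterns can be checked mechanically from the upper group's option counts. This sketch is illustrative (the exact-equality rules are simplifications of the slide's descriptions; real data would call for approximate comparisons):

```python
def diagnose_item(upper_counts, key):
    """upper_counts: {option: number of upper-group students choosing it}.
    Flags the three patterns above; the exact-equality rules are simplifications."""
    key_count = upper_counts[key]
    distractor_counts = [c for opt, c in upper_counts.items() if opt != key]
    if any(c > key_count for c in distractor_counts):
        return "possible miskeying"       # a distractor beats the key
    if len(set(upper_counts.values())) == 1:
        return "possible guessing"        # even spread across all options
    if any(c == key_count for c in distractor_counts):
        return "possible ambiguity"       # a distractor ties the key
    return "no obvious problem"

print(diagnose_item({"A": 8, "B": 2, "C": 1, "D": 0}, "B"))  # possible miskeying
print(diagnose_item({"A": 4, "B": 4, "C": 4, "D": 4}, "B"))  # possible guessing
print(diagnose_item({"A": 6, "B": 6, "C": 1, "D": 2}, "B"))  # possible ambiguity
```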



EXERCISE

1.  What is the difficulty level?
2.  What is the discrimination index?
3.  Should this item be eliminated?
4.  Should any distractors be eliminated?


QUALITATIVE ANALYSIS
•  Quantitative item analysis is useful but limited.
•  It points out items that have problems but doesn’t tell us
what the problems are.
•  It is possible that an item that fails to measure or match an
instructional objective could have an acceptable difficulty
level, an answer that discriminates positively, and
distractors that discriminate negatively.
•  Quantitative item analysis is fallible.
•  To do a thorough job of test analysis, one must use a
combination of quantitative and qualitative item analysis,
and not rely solely on one or the other. In other words,
there is no substitute for carefully scrutinizing and editing
items and matching test items with objectives.



ITEM ANALYSIS MODIFICATION FOR CRITERION-REFERENCED TESTS

•  Using Pre- and Post-Test Results as Lower and Upper Groups
•  By studying the difference between the difficulty (p) levels for each item at pre- and post-test, we can tell whether the intended learning has taken place.
•  At pretest, the p level should be low (e.g., 0.30 or lower), and at post-test, it should be high (e.g., 0.70 or higher).
•  Treat the pretest results for an item as the lower group (L) and the post-test results as the upper group (U); then perform the quantitative item analysis procedure as usual.
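A minimal sketch of this modification, assuming equal numbers of students take the pre- and post-test (function and variable names are illustrative):

```python
def criterion_referenced_D(pre_correct, post_correct, n):
    """Pretest takers act as the lower group, post-test takers as the upper
    group; D is then the gain in p. (Sketch; names are illustrative.)"""
    p_pre = pre_correct / n          # p level at pretest (should be low)
    p_post = post_correct / n        # p level at post-test (should be high)
    D = p_post - p_pre               # with equal-sized groups, (U - L) / n
    learning_evident = p_pre <= 0.30 and p_post >= 0.70
    return p_pre, p_post, D, learning_evident

p_pre, p_post, D, ok = criterion_referenced_D(pre_correct=5, post_correct=20, n=25)
print(p_pre, p_post, ok)  # 0.2 0.8 True
```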



EXAMPLE 1: Analyze the following results
Numbers of students choosing option (n = 25)

Step 1: Compute p levels for both tests.
Step 2: Determine the discrimination index (D) for the key.
Step 3: Determine whether each option discriminates negatively.




EXAMPLE 2: Analyze the results in the following table
Numbers of students choosing option (n = 25)

1.  Is there a substantial increase in p value (0.40 or more) between pre- and post-test?
2.  Was D greater than 0.40 for the key?
3.  Did all distractors discriminate negatively?
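The three checks can be applied programmatically once p and D values are in hand. An illustrative helper (names and the sample numbers are hypothetical):

```python
def evaluate_cr_item(p_pre, p_post, d_key, distractor_ds):
    """Run the three checks listed above on one criterion-referenced item."""
    return {
        "substantial p increase (0.40 or more)": p_post - p_pre >= 0.40,
        "D greater than 0.40 for the key": d_key > 0.40,
        "all distractors discriminate negatively": all(d < 0 for d in distractor_ds),
    }

checks = evaluate_cr_item(p_pre=0.20, p_post=0.80, d_key=0.60,
                          distractor_ds=[-0.2, -0.1, -0.3])
print(all(checks.values()))  # True
```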

DEBRIEFING GUIDELINES
ü Discuss Problem Items
ü Discuss any items you found problematic in scoring the test. This sets the stage for rational discussion and makes for more effective consideration of the item(s) in question. Also, you are more likely to have the students' attention before they have received their scores.
ü Listen to Student Reactions
ü Ask for student reactions to your comments and listen to their reactions.
Again, you are setting the stage for rational discussion of the test by
letting the students know you are interested in their feedback.
Remember, your goal is to improve the validity and reliability of your test
by improving on its weaknesses.
ü Avoid on-the-Spot Decisions
ü Tell your students that you will consider their comments, complaints, and
suggestions, but you will not make any decisions about omitting items,
partial credit, extra credit, and so forth until you have had time to study
and think about the test data. You may want to make it clear that
soliciting their comments is only for the purpose of preparing the next
test, not for reconsidering grades for the present test.


DEBRIEFING GUIDELINES

ü Be Equitable with Changes
ü If you decide to make changes, let your students know that any changes in scoring will apply to all students, not just those who raise objections.
After handing back answer sheets or grades, do the following.
ü Ask Students to Double-Check
ü Ask students to double-check your arithmetic, and ask any who think
clerical errors have been made to see you as soon as possible. Here you
are presenting yourself as human by admitting that you can make errors.
ü Ask Students to Identify Problems
ü If time permits, ask students to identify the items they find problematic
and why. Make note of the items and problems. Such items may then be
discussed or worked into some new instructional objectives.
