
Guidelines for Constructing Effective Test Items



Multiple-choice tests measure a variety of learning levels

They are easy to grade


Multiple-choice tests evaluate recognition (choosing an answer) rather than recall (constructing an answer)

They allow for guessing

They are fairly difficult to construct

Guidelines for constructing multiple-choice items

1. Make the content meaningful. Do not test trivial or unimportant facts.

Poor: Skinner developed programmed instruction in

a. 1953

b. 1954 (correct)

c. 1955

d. 1956

Better: Skinner developed programmed instruction in

a. 1930s

b. 1940s

c. 1950s (correct)

d. 1970s

2. Make all alternatives plausible as correct responses. To make sure your alternatives are plausible, define the class of things to which all of the answer choices should belong.


Poor: What is a claw hammer?

a. a woodworking tool (correct)

b. a musical instrument

c. a gardening tool

d. a shoe repair tool

Better: What is a claw hammer?

a. a woodworking tool (correct)

b. a metalworking tool

c. an autobody tool

d. a sheetmetal tool

3. While constructing the alternatives, ask yourself a few questions. How can the learner demonstrate knowledge of the subject? In what sort of circumstances might it be important to understand the concepts? If a learner did not know or understand the concepts or facts, what might be the consequences?

For example, suppose the subject is emergency medical care, and your course design documents tell you it is critical to know how to treat shock. You might ask yourself, “If someone did not know the proper treatment for shock, what steps might that person take?” Write down the answers that occur to you and then add the correct answer.


4. Reduce the length of the alternatives by moving as many words as possible to the stem. The rationale is that additional words in the alternatives have to be read four or five times, whereas words in the stem are read only once.

Poor: The mean is

a. a measure of the average (correct)

b. a measure of the midpoint

c. a measure of the most popular score

d. a measure of the dispersion of scores

Better: The mean is a measure of the

a. average (correct)

b. midpoint

c. most popular score

d. dispersion of scores

5. Construct the stem so that it conveys a complete thought.

Poor: Objectives are

a. used for planning instruction (correct)

b. written in behavioural form only

c. the last step in the instructional design process

d. used in the cognitive but not affective domain

Better: The main function of instructional objectives is

a. planning instruction (correct)

b. comparing teachers

c. selecting students with exceptional abilities

d. assigning students to academic programs


6. Do not make the correct answer stand out as a result of its phrasing or length.

Poor: A narrow strip of land bordered on both sides by water is called an

a. isthmus (correct)

b. peninsula

c. bayou

d. continent

(Note: Do you see why “a” would be the best guess given the phrasing? The article “an” at the end of the stem fits only an alternative that begins with a vowel.)

Better: A narrow strip of land bordered on both sides by water is called a(n)

7. Avoid overusing always and never in the alternatives. Students who are good test takers quickly learn to avoid those choices.

8. Avoid overusing “all of the above” and “none of the above”. When “all of the above” is used, students can eliminate it simply by knowing that one answer is false, and they will know to select it if they recognize that any two answers are true. A better strategy might be to include fewer choices rather than to use “all of the above” or “none of the above”.

9. Randomly select the position of the correct answer.
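
One way to apply guideline 9 is to let a random number generator decide the order of the alternatives instead of choosing the position by hand. The following is a minimal Python sketch and not part of the original guidelines; the function name shuffle_alternatives and the way an item is represented are assumptions made for illustration.

```python
import random

def shuffle_alternatives(correct, distractors, rng=None):
    """Return (alternatives, index_of_correct) with the order chosen at random."""
    if rng is None:
        rng = random.Random()
    alternatives = [correct] + list(distractors)
    rng.shuffle(alternatives)
    return alternatives, alternatives.index(correct)

# Hypothetical item based on the "mean" example under guideline 4.
alternatives, key = shuffle_alternatives(
    "average",
    ["midpoint", "most popular score", "dispersion of scores"],
    rng=random.Random(7),  # fixed seed only so the order can be reproduced
)
for letter, text in zip("abcd", alternatives):
    marker = " (correct)" if text == "average" else ""
    print(f"{letter}. {text}{marker}")
```

Shuffling each item independently means that, over a whole test, no single position is favoured for the correct answer.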



True/false items are fairly easy to write

They are very easy to grade


True/false items can only test for factual information

They allow for a high probability (50%) of guessing the correct answer

They limit assessments to lower levels of learning (knowledge and comprehension)
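
The guessing limitation can be made concrete with the binomial distribution. The sketch below is an illustration only: it assumes every item is answered by blind guessing and that the items are independent, and the 10-item test length and passing score of 7 are hypothetical numbers, not figures from the source.

```python
from math import comb

def p_at_least(k, n, p):
    """Probability of at least k correct out of n items when each guess is right with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

n_items, pass_mark = 10, 7  # hypothetical test length and passing score
print(f"true/false  (p = 1/2): {p_at_least(pass_mark, n_items, 0.5):.3f}")
print(f"four-choice (p = 1/4): {p_at_least(pass_mark, n_items, 0.25):.4f}")
```

Under these assumptions a pure guesser passes the true/false test about 17% of the time, compared with roughly 0.35% on a four-alternative multiple-choice test of the same length.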

Guidelines for constructing true/false items

1. Be certain that the statement is entirely true or entirely false.

Poor: A good instructional objective will identify a performance standard. (True/False) (Note: The correct answer here is technically false. However, the statement is ambiguous. While a performance standard is a feature of some “good” objectives, it is not necessary to make an objective good.)

Better: A performance standard of an objective should be stated in measurable terms. (True/False) (Note: The answer here is clearly true.)

2. Convey only one thought or idea in a true/false statement.

Poor: Bloom’s cognitive taxonomy of objectives includes six levels of objectives, the lowest being knowledge. (True/False)

Better: Bloom’s cognitive taxonomy includes six levels of objectives. (True/False) Knowledge is the lowest-level objective in Bloom’s cognitive taxonomy. (True/False)

3. Require learners to write a short explanation of why false answers are incorrect.

4. Incorporate third or fourth choices such as opinion (as opposed to fact) – “sometimes, but not always true”, “cannot be resolved”, etc.



A large amount of material can be condensed to fit in less space

Students have substantially fewer chances for guessing correct associations than on multiple-choice and true/false tests


Matching tests cannot effectively test higher order intellectual skills

Guidelines for constructing matching tests

1. Limit the number of items to a maximum of six or seven. It becomes very confusing for learners to try to match a greater number.

2. Limit the length of the items to a word, phrase, or brief sentence. In general, make the items as short as possible.

3. Provide one or two extra items (distractors) in the second column. Their inclusion reduces the probability of correct guessing. This also eliminates the situation that may occur in equal-sized lists, where if one match is incorrect, a second match must also be incorrect.
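
The effect described in guideline 3 can be checked by enumerating every possible answer sheet. This is a small Python sketch under stated assumptions: each option may be used at most once, item i is correctly matched with option i, and the list sizes (four items, two distractors) are hypothetical. The function name wrong_match_counts is invented for the illustration.

```python
from collections import Counter
from itertools import permutations

def wrong_match_counts(n_items, n_options):
    """Count one-to-one answer sheets by their number of wrong matches.

    Item i is correctly matched with option i; options numbered
    n_items and above are distractors.
    """
    counts = Counter()
    for sheet in permutations(range(n_options), n_items):
        wrong = sum(1 for item, option in enumerate(sheet) if item != option)
        counts[wrong] += 1
    return counts

equal = wrong_match_counts(4, 4)  # equal-sized lists
extra = wrong_match_counts(4, 6)  # two extra distractors in the second column
print("sheets with exactly one wrong match, equal lists:", equal[1])
print("sheets with exactly one wrong match, two distractors:", extra[1])
print("possible sheets:", sum(equal.values()), "vs", sum(extra.values()))
```

With equal-sized lists no sheet has exactly one wrong match, so a single error always costs the learner a second item. The two distractors remove that dependency and raise the number of possible sheets from 24 to 360, which lowers the chance of a perfect blind guess.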



Since the expected answers are specific, scoring can be fairly objective

These tests can test a large amount of content within a given time period


These test items are limited to testing lower-level cognitive objectives, such as the recall of facts

Scoring may not be as straightforward and objective as anticipated

Guidelines for constructing Short-Answer, Fill in the Blank and Completion Test Items

Word the test items so that only one answer is correct. Otherwise, scoring will become more subjective, and, when grades are at issue, arguments with students will become more frequent.

Poor: The first president of the United States was ______

(Note: The desired answer is George Washington, but students may write “from Virginia”, “a general”, and other creative expressions.)

Better: Give the first and last name of the first president of the United States: ______ (two words)



Essays can test for higher order intellectual skills and cognitive strategies

They are relatively easy to construct

They allow your learners to display mastery of an entire range of knowledge about a subject

A “short” essay requires a highly focused response

A “long” essay allows the learner more opportunity to express and defend a point of view


The more expansion or divergence allowed, the more difficult the grading is

Because students will have time to write only a few essays, a limited number of concepts or principles relating to a topic can be tested

If the questions asked are not focused, students may stray off the topic or misinterpret the type of response required. Scoring becomes more difficult

Time required for different learners to complete an essay test will vary greatly

Much time and care must be taken when grading so as to be as objective as possible and avoid making personal judgments

Guidelines for constructing essay tests

1. Make the questions as specific and focused as possible.

Poor: Describe the role of instructional objectives in education. Discuss Bloom’s contribution to the evaluation of instruction.

Better: Describe and differentiate between behavioural (Mager) and cognitive (Gronlund) objectives with regard to their (1) format and (2) relative advantages and disadvantages for specifying instructional intentions.

2. Inform students of the grading criteria and conditions. Will spelling count? How important is organization? Are all parts of the essay worth the same number of points? Can a dictionary be used? Do dates of historic events need to be indicated?

3. Write or outline a model answer. It will help you to focus on content, assign points to key concepts included, and grade more objectively and reliably.

4. Grade essays “in the blind”, that is, without knowing the writers’ identities. When multiple essay questions are required, evaluate a given question for all students before scoring the next essay.

5. Do not give students a choice of essays; have all respond to the same questions.



Problem-Solving Questions are well suited to evaluate higher-level cognitive outcomes

They are generally easy to construct


Scoring is fairly difficult, especially in situations in which there are alternative solution approaches and possible answers

Guidelines for constructing problem-solving questions

1. Specify the criteria for evaluation.

2. Award partial credit or give a separate score for using correct procedures when the final answer is incorrect. On many problems, a careless error may result in a wrong answer, even though the work shown conveys full understanding of the problem.

3. Construct a model answer for each problem that indicates the amount of credit to be awarded for work at different stages.
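
Guidelines 2 and 3 can be combined in a simple scoring scheme in which the model answer lists the stages of a solution and the credit attached to each. The Python sketch below is only an illustration: the stage descriptions, the point values, and the function name score_solution are all hypothetical.

```python
# Hypothetical model answer: each stage of the solution carries its own credit.
MODEL_ANSWER = [
    ("sets up the correct equation", 4),
    ("carries out the procedure correctly", 4),
    ("states the correct final answer", 2),
]

def score_solution(stages_met):
    """Return (points earned, points possible) for the stages the work shows."""
    earned = sum(points for stage, points in MODEL_ANSWER if stage in stages_met)
    possible = sum(points for _, points in MODEL_ANSWER)
    return earned, possible

# A careless slip: the method is right, but the final answer is wrong.
earned, possible = score_solution({
    "sets up the correct equation",
    "carries out the procedure correctly",
})
print(f"{earned}/{possible}")
```

Writing the stages down before grading keeps the partial credit consistent from one learner to the next.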


Once you have constructed your test items, regardless of the type and the format, ask yourself the following questions:

Do the items truly measure what I am trying to measure?

Will the intent of the items be clear to someone reading them for the first time?

Do my learners have all of the information they need to answer the items?

Is the wording as clear and concise as possible? If not, can the item be revised and still understood?

Is the correct answer clearly correct and up-to-date according to experts in the field?


Morrison, G.R., Ross, S.M., Kemp, J.E. (2004). Designing Effective Instruction. Fourth Edition.

Cantor, J.A. (2001). Delivering Instruction to Adult Learners. Revised Edition.