
CHAPTER I

INTRODUCTION

The paper draws on those foundations and tools to begin the process of designing tests or revising existing tests. To start that process we need to ask some critical questions: what is the purpose of the test, what are the objectives of the test, how will the test specifications reflect both the purpose and the objectives, how will the test tasks be selected and the separate items arranged, and what kind of scoring, grading, and feedback will be provided?
Establishing appropriate objectives involves a number of issues, ranging from
relatively simple ones about forms and functions covered in a course unit to much more
complex ones about constructs to be operationalized in the test. Included here are
decisions about what language abilities are to be assessed.
To evaluate or design a test, we must make sure that the objectives are
incorporated into a structure that appropriately weights the various competencies being
assessed. The tasks that the test-takers must perform need to be practical. They should
also achieve content validity by presenting tasks that mirror those of the course or
segment thereof being assessed. Further, they should be able to be evaluated reliably
by the teacher or scorer. The tasks themselves should strive for authenticity, and the
progression of tasks ought to be biased for best performance.
Tests vary in the form and function of feedback, depending on their purpose. For every test, the way results are reported is an important consideration. Some circumstances may require that a teacher offer substantive washback to the learners.
In this paper, you will draw on these foundations and tools to begin the process
of designing tests or revising existing tests. To start that process, you need to ask some
critical questions:

1. What is the purpose of the test? Why am I creating this test, or why was it created by someone else? For an evaluation of overall proficiency? To place students into a course? To measure achievement within a course? Once you have established the major purpose of a test, you can determine its objectives.
2. What are the objectives of the test? What specifically am I trying to find out? Establishing appropriate objectives involves a number of issues, ranging from relatively simple ones about forms and functions covered in a course unit to much more complex ones about constructs to be operationalized in the test. Included here are decisions about what language abilities are to be assessed.
3. How will the test specifications reflect both the purpose and objectives? To
evaluate or design a test, you must make sure that objectives are incorporated
into a structure that appropriately weights the various competencies being
assessed.
4. How will the test tasks be selected and the separate items arranged? The tasks that the test-taker must perform need to be practical in the ways defined in the previous chapter. They should also achieve content validity by presenting tasks that mirror those of the course being assessed. Further, they should be able to be evaluated reliably by the teacher or scorer. The tasks themselves should strive for authenticity, and the progression of tasks ought to be biased for best performance.
5. What kind of scoring, grading, and feedback is expected? Tests vary in the form and function of feedback, depending on their purpose. For every test, the way results are reported is an important consideration, and some circumstances may require that a teacher offer substantive washback to the learners.

CHAPTER II
DISCUSSION

A. Test Types
There are many kinds of tests; each test has specific purpose and a particular
criterion to be measured. This written will explain about kinds of tests based on specific
purpose, response, orientation and the way to test, and score interpretation.

1. Based on specific purpose


a. Proficiency Test
A proficiency test is not limited to any one course, curriculum, or single
skill in the language; rather, it tests overall ability. Proficiency tests are almost
always summative and norm-referenced. They provide results in the form of a
single score, which is a sufficient result for the gate-keeping role they play of
accepting or denying someone passage into the next stage of a journey. And
because they measure performance against a norm, with equated scores and
percentile ranks taking on paramount importance, they are usually not equipped
to provide diagnostic feedback.
A typical example of a standardized proficiency test is The Test of
English as a Foreign Language (TOEFL) produced by the Educational Testing
Service. The TOEFL consists of sections on Listening comprehension, structure
or grammatical accuracy, reading comprehension, and written expression.

Example of Language Proficiency Test


A proficiency test is not limited to any one course, curriculum, or single
skill in the language; rather, it tests overall ability. A typical example of a
standardized proficiency test is the Test of English as a Foreign Language
(TOEFL) produced by the Educational Testing Service. A key issue in
testing proficiency is how the constructs of language ability are specified.

1. The committee has met twice and ….


A. they reached a final decision.
B. a final decision was reached.
C. its decision was reached.
D. it has reached a final decision.
2. Brenda's score on the test is the highest in class ….
A. She should study hard last night.
B. She should have studied hard last night.
C. She must have studied hard last night.
D. She had to study hard last night.

b. Diagnostic Test
A diagnostic test is used to identify students' strengths and weaknesses. Another purpose is to diagnose specific aspects of a language. A test in pronunciation, for example, might diagnose the phonological features of English that are difficult for learners and should therefore become part of a curriculum. Usually, such tests offer a checklist of features for the administrator (often the teacher) to use in pinpointing difficulties. For example, a writing diagnostic test would first elicit a writing sample from the students. Then the teacher would identify the organization, content, spelling, grammar, or vocabulary of their writing. Based on that identification, the teacher would know which student needs should receive special focus.
c. Placement Test
The purpose of a placement test is to place a student into a particular level or section of a language curriculum or school. It usually includes a sampling of the material to be covered in the various courses in a curriculum. A student's performance on the test should indicate the point at which the student will find material neither too easy nor too difficult. Placement tests come in many varieties: assessing comprehension and production, responding through written and oral performance, multiple-choice, and gap-filling formats. One of the examples of placement tests is the English as a Second Language Placement Test (ESLPT) at San Francisco State University.

Part one: Straightforward Beginner and Elementary Placement test


The Straightforward Beginner and Elementary Placement test has been
designed to help you decide whether the Straightforward Beginner course
would be suitable for your students or whether they would qualify for
using the Straightforward Elementary Course.
The Straightforward test has 50 questions, each worth one point. The first
40 are grammar questions and the final 10 are vocabulary questions. The
conversion chart below has been designed to assist you in making your
decision but please note, however, that these bandings are a guide.
Total score Level
0-35 Beginner
36-50 Elementary

This test can also be used to diagnose the grammar of the Beginner level
that your students need clarification on.
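
To illustrate how the conversion chart above might be applied, here is a minimal sketch in Python; the function name and the strict 0-50 range check are our own illustrative assumptions, not part of the published test materials.

```python
def straightforward_level(total_score: int) -> str:
    """Map a Straightforward placement total (0-50) to a level band.

    Bands follow the conversion chart above: 0-35 Beginner, 36-50 Elementary.
    The chart is only a guide, so borderline scores still call for teacher judgment.
    """
    if not 0 <= total_score <= 50:
        raise ValueError("total_score must be between 0 and 50")
    return "Beginner" if total_score <= 35 else "Elementary"


print(straightforward_level(33))  # Beginner
print(straightforward_level(42))  # Elementary
```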
Grammar
1. …. is your name?
A. How
B. Who
C. What
D. Where
2. ….? I’m from Italy.
A. Where are you from?
B. Where you are from?
C. Where from you are?
D. From where you are?
d. Achievement Test
The purpose of achievement tests is to determine whether course objectives have been met, with skills acquired, by the end of a period of instruction. Achievement tests should be limited to particular material addressed in a curriculum within a particular time frame. Achievement tests are summative because they are administered at the end of a unit or term of study. They analyze the extent to which students have acquired language that has already been taught.

Achievement tests range from five- or ten-minute quizzes to three-hour final examinations, with an almost infinite variety of item types and formats.
Here is the outline for a mid-term examination offered at the high
intermediate level of an intensive English program in the US.
Section A Vocabulary
Part 1 (5 items) : Match words and definitions
Part 2 (5 items) : use the words in a sentence
Section B Grammar
(10 sentences) : error detection (Underline or circle the error)
Section C Reading Comprehension
(2 one-paragraph passages) : Four short-answer items for each
Section D Writing
Respond to a two-paragraph article on Native American culture

e. Language Aptitude Test


The purpose of a language aptitude test is to predict a person's success prior to exposure to the foreign language. According to John Carroll and Stanley Sapon (the authors of the MLAT), language aptitude tests do not refer to whether or not an individual can learn a foreign language; rather, they refer to how well an individual can learn a foreign language in a given amount of time and under given conditions. In other words, this test is done to determine how quickly and easily a learner can learn a language in a language course or language training program. Two standardized aptitude tests have been used in the United States:
a. The Modern Language Aptitude Test (MLAT)
b. The Pimsleur Language Aptitude Battery (PLAB)
The Modern Language Aptitude Test (MLAT) tasks:
a. Number Learning
b. Phonetic Script
c. Spelling clues
d. Words in sentences
e. Paired associates

2. Based on response
There are two kinds of tests based on response. They are objective and
subjective tests.
a. Objective Test
An objective test is a test in which learners' ability or performance is measured using a specific set of answers, meaning there are only two possible outcomes for each item: right or wrong. In other words, the score is based on the number of right answers. Types of objective test include multiple-choice tests, true or false tests, matching, and problem-based questions.
b. Subjective Test
A subjective test is a test in which the learners' ability or performance is judged by the examiner's opinion and judgment. Examples of subjective tests are essays and short-answer items.

Advantages and Disadvantages of Commonly Used Types of Objective Test

Type of Test: True or False
Advantages: Many items can be administered in a relatively short time. Moderately easy to write and easily scored.
Disadvantages: Limited primarily to testing knowledge of information. Easy to guess correctly on many items, even if material has not been mastered.

Type of Test: Multiple Choice
Advantages: Can be used to assess a broad range of content in a brief period. Skillfully written items can measure higher-order cognitive skills. Can be scored quickly.
Disadvantages: Difficult and time-consuming to write good items. Possible to assess higher-order cognitive skills, but most items assess only knowledge. Some correct answers can be guessed.

Type of Test: Matching
Advantages: Items can be written quickly. A broad range of content can be assessed. Scoring can be done efficiently.
Disadvantages: Higher-order cognitive skills are difficult to assess.

Advantages and Disadvantages of Commonly Used Types of Subjective Test

Type of Test: Short Answer
Advantages: Many can be administered in a brief amount of time. Relatively efficient to score. Moderately easy to write items.
Disadvantages: Difficult to identify defensible criteria for correct answers. Limited to questions that can be answered or completed in a few words.

Type of Test: Essay
Advantages: Can be used to measure higher-order cognitive skills. Easy to write questions. Difficult for respondent to get correct answer by guessing.
Disadvantages: Time-consuming to administer and score. Difficult to identify reliable criteria for scoring. Only a limited range of content can be sampled during any one testing period.

3. Based on orientation and the way to test


Language testing is divided into two types based on orientation: language competence tests and language performance tests. A language competence test is a test that involves components of language such as vocabulary, grammar, and pronunciation, while a performance test is a test that involves the basic skills in English, that is, listening, speaking, reading, and writing. Moreover, language testing is also divided into two types based on the way of testing: direct testing and indirect testing. Direct testing is a test in which the process of eliciting students' competences uses a basic skill, such as listening, speaking, reading, or writing, while indirect language testing is a test in which the process of eliciting students' competences does not use the basic skills.
From the explanation above, language testing can be divided into four types based on orientation and the way of testing: direct competence tests, indirect competence tests, direct performance tests, and indirect performance tests.
1.1 Direct competence test
The direct competence test is a test that focuses on measuring students' knowledge of a language component, like grammar or vocabulary, in which the elicitation uses one of the basic skills: listening, speaking, reading, or writing. For example, a teacher wants to know about students' grammar knowledge. The teacher asks the students to write a letter to elicit the students' knowledge of grammar.
1.2 Indirect competence test
The indirect competence test is a test that focuses on measuring students' knowledge of a language component, like grammar or vocabulary, in which the elicitation does not use one of the basic skills (listening, speaking, reading, or writing). The elicitation in this test uses other means, such as multiple choice. For example, the teacher wants to know about students' grammar knowledge. The teacher gives a multiple-choice test to measure students' knowledge of grammar.
2.1 Direct performance test
A direct performance test is a test that focuses on measuring students' skill in listening, speaking, reading, and writing, in which the elicitation is through direct communication. For example, the teacher wants to know the students' skill in writing, so the teacher asks the students to write a letter or a short story.
2.2 Indirect performance test
An indirect performance test is a test that focuses on measuring students' skill in listening, speaking, reading, and writing, in which the elicitation does not use the basic skill directly. For example, the teacher wants to measure the students' skill in listening. The teacher gives some pictures and asks the students to arrange the pictures into the correct order based on the story that they listen to.

4. Based on score interpretation


There are two kinds of tests based on score interpretation: norm-referenced tests and criterion-referenced tests.
1. Norm-Referenced Test
Norm-referenced tests are designed to highlight achievement differences between and among students, producing a dependable rank order of students across a continuum of achievement from high achievers to low achievers (Stiggins, 1994). School systems might want to classify students in this way so that they can be properly placed in remedial or gifted programs. The content of norm-referenced tests is selected according to how well it ranks students from high achievers to low. In other words, the content selected in norm-referenced tests is chosen by how well it discriminates among students. A student's performance on a norm-referenced test is interpreted in relation to the performance of a large group of similar students who took the test when it was first normed. For example, if a student receives a percentile rank score on the total test of 34, this means that he or she performed as well as or better than 34% of the students in the norm group. This type of information can be useful for deciding whether students need remedial assistance or are candidates for a gifted program. However, the score gives little information about what the student actually knows or can do.
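
To make the percentile-rank interpretation concrete, here is a minimal sketch in Python; the function name and the toy norm-group scores are illustrative assumptions, and the rank is computed as the percentage of norm-group scores at or below the student's score, matching the "as well as or better than" reading above.

```python
def percentile_rank(student_score: float, norm_group_scores: list[float]) -> float:
    """Percentage of the norm group scoring at or below the student's score."""
    at_or_below = sum(1 for s in norm_group_scores if s <= student_score)
    return 100 * at_or_below / len(norm_group_scores)


# A toy norm group of 20 scores; a student scoring 26 did as well as or better
# than 30% of this group, which says nothing about what the student can
# actually do -- only where he or she ranks.
norm_group = [12, 15, 18, 20, 22, 25, 27, 28, 30, 31,
              33, 35, 38, 40, 42, 45, 47, 50, 52, 55]
print(percentile_rank(26, norm_group))  # 30.0
```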
2. Criterion-Referenced Test
Criterion-referenced tests determine what test takers can do and what they know, not how they compare to others (Anastasi, 1988). Criterion-referenced tests report how well students are doing relative to a pre-determined performance level on a specified set of educational goals or outcomes included in the school, district, or state curriculum. Educators may choose to use a criterion-referenced test when they wish to see how well students have learned the knowledge and skills which they are expected to have mastered. This information may be used to judge how well students are learning the desired curriculum and how well the school is teaching that curriculum. The content of a criterion-referenced test is determined by how well it matches the learning outcomes deemed most important. In other words, the content selected for criterion-referenced tests is selected on the basis of its significance in the curriculum. Criterion-referenced tests give detailed information about how well a student has performed on each of the educational goals or outcomes included on that test.

B. Some Practical Steps to Test Construction

The descriptions of types of tests in the preceding section are intended to help you understand how to answer the first question posed in this chapter: What is the purpose of the test? It is unlikely that you would be asked to design an aptitude test or a proficiency test, but for the purposes of interpreting those tests, it is important that you understand their nature. However, your opportunities to design placement, diagnostic, and achievement tests (especially the latter) will be plentiful. In the
remainder of this chapter, we will explore the four remaining questions posed at the
outset, and the focus will be on equipping you with the tools you need to create such
classroom-oriented tests.

You may think that every test you devise must be a wonderfully innovative
instrument that will garner the accolades of your colleagues and the admiration of your
students. Not so. First, new and innovative testing formats take a lot of effort to design
and a long time to refine through trial and error. Second, traditional testing techniques
can, with a little creativity, conform to the spirit of an interactive, communicative
language curriculum. Your best tack as a new teacher is to work within the guidelines
of accepted, known, traditional testing techniques. Slowly, with experience, you can
get bolder in your attempts. In that spirit, then, let us consider some practical steps in
constructing classroom tests.

1. Assessing clear, unambiguous objectives


To know the purpose of the test you are creating, you need to know as specifically as possible what it is you want to test. A good way to approach a test is to take a careful look at everything that you think your students should "know" or be able to "do," based on the material that the students are responsible for. In other words, examine the objectives for the unit you are teaching. An example of such objectives is outlined below.

Form-focused objectives (listening and speaking)


Students will
1. recognize and produce tag questions, with the correct grammatical form and final intonation pattern, in simple social conversation.
2. recognize and produce wh-information questions with correct final intonation pattern.
Communication skills (speaking)
Students will
3. state completed actions and events in a social conversation.
4. ask for confirmation in a social conversation.
5. give opinions about an event in a social conversation.
6. produce language with contextually appropriate intonation, stress, and rhythm.
Reading skills (simple essay or story)
Students will
7. recognize irregular past tense of selected verbs in a story or essay.
Writing skills (simple essay or story)
Students will
8. write a one-paragraph story about a simple event in the past.
9. use conjunctions so and because in a statement of opinion.

You may find, in reviewing the objectives of a unit or a course, that you cannot possibly test each one. You will then need to choose a possible subset of the
objectives to test.
2. Drawing up test specifications
In the unit discussed above, your specifications will simply comprise (a) a
broad outline of the test, (b) what skills you will test, and (c) what the items will
look like.

Here is the example of test specifications.

Speaking (5 minutes per person, previous day)

Format: oral interview, T and S


Task: T asks questions of S (objectives 3, 5; emphasis on 6)

Listening (10 minutes)

Format: T makes audiotape in advance, with one other voice on it


Tasks: a. 5 minimal pair items, multiple-choice (objective 1)
b. 5 interpretation items, multiple-choice (objective 2)

Reading (10 minutes)

Format: cloze test items (10 total) in a story line


Tasks: fill-in blanks (objective 7)

Writing (10 minutes)

Format: prompt for a topic: why I liked/didn't like a recent TV sitcom


Task: writing a short opinion paragraph (objective 9)

These informal, classroom-oriented specifications give you an indication of


the topics (objectives) you will cover, the implied elicitation and response formats
for items, the number of items in each section, and the time to be allocated for each.

3. Devising test tasks


Your oral interview comes first, and so you draft questions to conform to the accepted pattern of oral interviews. You begin and end with nonscored items (warm-up and wind-down) designed to set students at ease, and then sandwich between them items intended to test the objective (level check) and a little beyond (probe).

Oral interview format

A. Warm-up: questions and comments


B. Level-check questions (objectives 3, 5, and 6)
1. Tell me about what you did last weekend.
2. Tell me about an interesting trip you took in the last year.
3. How did you like the TV show we saw this week?
C. Probe (objectives 5, 6)
1. What is your opinion about______________? (news event)
2. How do you feel about_____________? (another news event)
D. Wind-down: comments and reassurance

Test items, first draft


Listening, Part A (sample item)
Directions: Listen to the sentence (on the tape). Choose the sentence on
your test page that is closest in meaning to the sentence you heard.
Voice: They sure made a mess at that party, didn't they?
S reads: a. They didn't make a mess, did they?


b. They did make a mess, didn't they?
Listening, Part B (sample item)
Directions: Listen to the question (on the tape). Choose the sentence on
your test page that is the best answer to the question.
Voice: Where did George go after the party last night?
S reads: a. Yes, he did.
b. Because he was tired.
c. To Elaine's place for another party.
d. He went home around eleven o'clock
Reading (sample items)
Directions: Fill in the correct tense of the verb (in parentheses) that
should go in each blank.
Then in the middle of this loud party they (hear)__________ the
loudest thunder you have ever heard! And then right away lightning
(strike)__________ right outside their house!
Writing
Directions: Write a paragraph about what you liked or didn't like
about one of the characters at the party in the TV sitcom we saw.
4. Designing Multiple-Choice Test Items

Based on http://www.ryerson.ca/lt/resources, when determining the best option for test design, consider the pros and cons of using multiple-choice questions.
Strengths of Multiple-Choice Items:
a. Versatility in measuring all levels of cognitive skills.
b. Permit a wide sampling of content and objectives.
c. Provide highly reliable test scores.
d. Can be machine-scored quickly and accurately.
Limitations of Multiple-Choice Items:


a. Difficult and time-consuming to construct.
b. Depend on students' reading skills and the instructor's writing ability.
c. Ease of writing low-level knowledge items leads instructors to neglect writing items to test higher-level thinking.
d. May encourage guessing.
Parts of a Multiple-Choice Question
A traditional multiple-choice question (or item) is one in which a student
chooses one answer from a number of choices supplied. A multiple-choice
question consists of:
a. A stem: the text of the question
b. Options: the choices provided after the stem
c. The key: the correct answer in the list of options
d. Distractors: the incorrect answers in the list of options
Since there will be occasions when multiple-choice items are appropriate,
consider the following four guidelines for designing multiple-choice items for
both classroom-based and large-scale situations (adapted from Gronlund, 1998:
60-75, Brown, 1996: 54-57 in Brown, 2004: 56-58).

4.1 Design each item to measure a specific objective


Multiple-choice item, revised
Voice: Where did George go after the party last night?
S reads: a. Yes, he did
b. Because he was tired.
c. To Elaine's place for another party.
d. Around eleven o'clock.

The specific objective being tested here is comprehension of wh-questions. Distractor (a) is designed to ascertain that the student knows the difference between an answer to a wh-question and a yes/no question. Distractors (b) and (d), as well as the key item (c), test comprehension of the meaning of where as opposed to why and when. The objective has been directly addressed.
4.2 State both stem and options as simply and directly as possible
We are sometimes tempted to make multiple-choice items too wordy. A good rule of thumb is to get directly to the point. Here is an example.

Multiple-choice cloze item, flawed


My eyesight has really been deteriorating lately. I wonder if I need glasses.
I think I'd better go to the ________________________ to have my eyes
checked.
a. pediatrician
b. dermatologist
c. optometrist

You might argue that the first two sentences of this item give it some
authenticity and accomplish a bit of schema setting. But if you simply want a
student to identify the type of medical professional who deals with eyesight
issues, those sentences are superfluous. Moreover, by lengthening the stem, you
have introduced a potentially confounding lexical item, deteriorate, that could
distract the student unnecessarily.
4.3 Make certain that the intended answer is clearly the only correct one.
In the proposed unit test described earlier, the following item appeared in the
original draft:

Multiple choice item, flawed


Voice: Where did George go after the party last night?
S reads: a. Yes, he did,
b. Because he was tired.
c. To Elaine's place for another party.


d. Around eleven o'clock.

A quick consideration of the distractor (d) reveals that it is a plausible answer,


along with the intended key, (c). Eliminating unintended possible answers is
often the most difficult problem of designing multiple-choice items. With only
a minimum of context in each stem, a wide variety of responses may be
perceived as correct.

4.4 Use item indices to accept, discard, or revise items.


The appropriate selection and arrangement of suitable multiple-choice items on
a test can be accomplished by measuring items against three indices: item
facility (or item difficulty), item discrimination (sometimes called item
differentiation), and distractor analysis.
4.4.1 Item Facility (or IF) is the extent to which an item is easy or
difficult for the proposed group of test-takers. IF simply reflects the
percentage of students answering the item correctly. The formula
looks like this:

IF = (number of students answering the item correctly) / (total number of students responding to that item)

For example, if you have an item on which 13 out of 20 students


respond correctly, your IF index is 13 divided by 20 or .65 (65
percent). There is no absolute IF value that must be met to determine
if an item should be included in the test as is, modified, or thrown
out, but appropriate test items will generally have IFs that range between .15 and .85. Two good reasons for occasionally including a very easy item (.85 or higher) are to build in some affective feelings of "success" among lower-ability students and to serve as warm-up items. And very difficult items can provide a challenge to the highest-ability students.
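
A minimal sketch of this calculation in Python is given below, assuming each student's response to the item has been recorded as correct or incorrect; the function name and the .15-.85 review check are illustrative, not a standard routine from any testing package.

```python
def item_facility(responses: list[bool]) -> float:
    """IF = number of students answering the item correctly
            / total number of students responding to that item."""
    return sum(responses) / len(responses)


# 13 of 20 students answered the item correctly, as in the example above.
answers = [True] * 13 + [False] * 7
IF = item_facility(answers)
print(IF)  # 0.65

# Flag items outside the rough .15-.85 working range for review.
if not 0.15 <= IF <= 0.85:
    print("Consider revising or discarding this item.")
```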
4.4.2 Item Discrimination (ID) is the extent to which an item
differentiates between high-and low-ability test takers. An item on
which high-ability students (who did well in the test) and low-ability
students (who didn't) score equally well would have poor ID
because it did not discriminate between the two groups. Conversely, an item that garners correct responses from most of the high-ability group and incorrect responses from most of the low-ability group has good discrimination power.
Suppose your class of 30 students has taken a test. Once you have
calculated final scores for all 30 students, divide them roughly into
thirds-that is, create three rank-ordered ability groups including the
top 10 scores, the middle 10, and the lowest 10. To find out which
of your 50 or so test items were most "powerful" in discriminating
between high and low ability, eliminate the middle group, leaving
two groups with results that might look something like this on a
particular item:

Item #23 #Correct #Incorrect


High-ability Ss (top 10) 7 3
Low-ability Ss (bottom 10) 2 8

Using the ID formula (7 − 2 = 5; 5 ÷ 10 = .50), you would find that this item has an ID of .50, or a moderate level.
The formula for calculating ID is
ID = (high group # correct − low group # correct) / (1/2 × total of your two comparison groups)
   = (7 − 2) / (1/2 × 20) = 5/10 = .50

The result of this example item tells you that the item has a moderate
level of ID. High discriminating power would approach a perfect
1.0, and no discriminating power at all would be zero. In most cases,
you would want to discard an item that scored near zero. As with IF,
no absolute rule governs the establishment of acceptable and
unacceptable ID indices.
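
The same arithmetic can be sketched in Python as follows; the function name and keyword arguments are illustrative assumptions, and the formula presumes the two comparison groups are the ones obtained by the rank-ordering procedure described above.

```python
def item_discrimination(high_correct: int, low_correct: int,
                        high_total: int, low_total: int) -> float:
    """ID = (high-group # correct - low-group # correct)
            / (1/2 x total of the two comparison groups)."""
    return (high_correct - low_correct) / (0.5 * (high_total + low_total))


# Item #23 from the table above: 7 of the top 10 and 2 of the bottom 10 correct.
print(item_discrimination(high_correct=7, low_correct=2,
                          high_total=10, low_total=10))  # 0.5
```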

4.4.3 Distractor Efficiency is one more important measure of a multiple-choice item's value in a test, and one that is related to item discrimination. Consider the following. The same item (#23) used above is a multiple-choice item with five choices, and responses across upper- and lower-ability students are distributed as follows:

Choices A B *C D E
High-ability Ss (10) 0 1 7 0 2
Low-ability Ss (10) 3 5 2 0 0
*Note: C is the correct response

As shown above, its ID is .50, which is acceptable, but the item might be improved in two ways: (a) Distractor D doesn't fool anyone. No one picked it, and therefore it probably has no utility. A revision might provide a distractor that actually attracts a response or two. (b) Distractor E attracts more responses (2) from the high-ability group than from the low-ability group (0). Why are good students choosing this one? Perhaps it includes a subtle reference that entices the high group but is "over the head" of the low group, and therefore the latter students don't even consider it. The other two distractors (A and B) seem to be fulfilling their function of attracting some attention from lower-ability students.
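
A distractor analysis of this kind can also be tabulated programmatically. The sketch below is a minimal illustration, assuming each student's selected option has been recorded as a single letter; the function name and report format are our own, not part of any testing package.

```python
from collections import Counter


def distractor_report(high_choices: str, low_choices: str, key: str) -> None:
    """Show how often each option was chosen by the high and low groups.

    Distractors chosen by no one, or chosen more often by the high group than
    by the low group, are the ones worth revising (points (a) and (b) above).
    """
    high, low = Counter(high_choices), Counter(low_choices)
    for option in "ABCDE":
        marker = "*" if option == key else " "
        print(f"{marker}{option}: high={high[option]}  low={low[option]}")


# Response pattern for item #23 from the table above (10 students per group).
distractor_report(high_choices="BCCCCCCCEE", low_choices="AAABBBBBCC", key="C")
```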

C. Scoring, Grading, and Giving Feedback

1. Scoring
As you design a classroom test, you must consider how the test will be scored and graded. Your scoring plans reflect the relative weight that you place on each section and on each item in each section.
Here are your decisions about scoring your test:

Section          Percent of Total Grade    Possible Total Correct
Oral Interview   40%                       4 scores, 5-to-1 range, x 2 = 40
Listening        20%                       10 items @ 2 points each = 20
Reading          20%                       10 items @ 2 points each = 20
Writing          20%                       2 scores, 5-to-1 range, x 2 = 20
Total            100%                      100
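
As a minimal sketch of how this weighting could be applied when computing a student's total, consider the following Python fragment; the function name and the example ratings are illustrative assumptions, not prescribed by the scoring plan itself.

```python
def total_score(oral_ratings, listening_correct, reading_correct, writing_ratings):
    """Combine section results according to the weighting table above.

    oral_ratings:      four ratings on a 5-to-1 scale, each doubled (max 40)
    listening_correct: number of listening items correct, 2 points each (max 20)
    reading_correct:   number of reading items correct, 2 points each (max 20)
    writing_ratings:   two ratings on a 5-to-1 scale, each doubled (max 20)
    """
    oral = 2 * sum(oral_ratings)
    listening = 2 * listening_correct
    reading = 2 * reading_correct
    writing = 2 * sum(writing_ratings)
    return oral + listening + reading + writing


# A hypothetical student: oral ratings 4, 5, 3, 4; 8/10 listening; 9/10 reading;
# writing ratings 4 and 4 -- giving 32 + 16 + 18 + 16 = 82 out of 100.
print(total_score([4, 5, 3, 4], 8, 9, [4, 4]))  # 82
```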

2. Grading
Your first thought might be that assigning grades to students' performance on this test would be easy: just give an "A" for 90-100 percent, a "B" for 80-89 percent, and so on. How you assign letter grades to this test is a product of
• The country, culture, and context of this English classroom,
• Institutional expectations (most of them unwritten),
• Explicit and implicit definitions of grades that you have set forth,
• The relationship you have established with this class, and
• Student expectations that have been engendered in previous tests and quizzes in this class.

3. Giving Feedback
A section on scoring and grading would not be complete without some consideration of the forms in which you will offer feedback to your students, feedback that you want to become beneficial washback. In the example test that we have been referring to here (which is not unusual in the universe of possible formats for periodic classroom tests), consider the multitude of options. You might choose to return the test to the student with one of, or a combination of, any of the possibilities below:
1. A letter grade
2. A total score
3. Four sub scores (speaking, listening, reading, writing)
4. For the listening and reading sections
a. An indication of correct/incorrect responses
b. Marginal comments
5. For the oral interview
a. Scores for each element being rated
b. A checklist of areas needing work
c. Oral feedback after the interview
d. A post-interview conference to go over the results
6. On the essay
a. Scores for each element being rated
b. A checklist of areas needing work
c. Marginal and end-of-essay comments, suggestions
d. A post-test conference to go over work
e. A self-assessment
7. On all or selected parts of the test, peer checking of results
8. A whole-class discussion of results of the test
9. Individual conferences with each student to review the whole test.


In this chapter, guidelines and tools were provided to enable you to address the five questions posed at the outset: (1) how to determine the purpose or criterion of the test, (2) how to state objectives, (3) how to design specifications, (4) how to select and arrange test tasks, including evaluating those tasks with item indices, and (5) how to ensure appropriate washback to the student. This five-part template can serve as a pattern as you design classroom tests.
Ways of Assessing Learners
Progress and achievement can also be assessed through written work (done in
class or for homework), class activities such as simulations or oral presentations, group
projects, and self-assessment by the learners themselves. Clear assessment criteria are
needed. If the results are to be included in the overall course assessment of the learners,
the criteria should be understood and agreed upon by all the teachers involved.

CHAPTER III
CONCLUSION

There are five kinds of test types: language aptitude tests, proficiency tests, placement tests, diagnostic tests, and achievement tests. Not every test needs to be a wonderfully innovative instrument that garners the accolades of colleagues and the admiration of students; traditional testing techniques, used with a little creativity, can serve a communicative curriculum well.
There are some practical steps to test construction: assessing clear, unambiguous objectives, drawing up test specifications, devising test tasks, and designing multiple-choice test items.
Evaluation can fulfill two functions: assessment and feedback. Assessment is a
matter of measuring what the learners already know. Any assessment should also
provide positive feedback to inform teachers and learners about what is still not known,
thus providing important input to the content and methods of future work.
