A Critical Appraisal of Teacher (Punya A)

A CRITICAL ANALYSIS OF TEACHER-MADE MULTIPLE CHOICE GRAMMAR TEST
1. Introduction
1.1. Meaning and Kinds of Grammar
Grammar is a set of logical and structural rules that govern the
composition of sentences, phrases, and words in any given natural language.
The term refers also to the study of such rules, and this field includes
morphology and syntax, often complemented by phonetics, phonology,
semantics, and pragmatics.
Each language has its own distinct grammar. "English grammar" is the
set of rules within the English language itself. "An English grammar" is a
specific study or analysis of these rules. Approaches to English grammar may be
divided into two broad categories: descriptive and prescriptive. A
descriptive grammar tries to look at the grammar of any spoken language or
dialect as it actually exists, judging whether a sentence is grammatical or not
based on the rules of the speech group in which it is spoken, rather than an
arbitrary set of rules. A prescriptive grammar looks at the norms of speech as
given by authoritative sources, such as an upper-class or academic subculture,
and creates strict rules by which all speech within that language must abide to
be considered grammatical. Few linguists take a prescriptive approach to
Language Testing Page 1
grammar in the modern age, preferring to describe language as it exists in a
given speech community.
1.2. Reasons for Selecting Grammar Test
Grammar is undoubtedly important for English learners, since it is the basis
for forming a correct sentence. Many people still find it hard to express their
ideas to native speakers of English in correct statements; as a result, native
speakers frequently misunderstand what they have said. Given this phenomenon,
it is essential for English learners, especially students, to master English
grammar as well as vocabulary. It may be argued that correct use of grammar is
an essential aspect of communicative competence. Teachers recognise that
grammatical accuracy is an integral part of proficiency, but that it is a means
to an end rather than an end in itself; they have therefore often tried to plan
classes in which students may acquire grammar by using the language in
situations where it is needed and may be practised communicatively.
David Crystal and William Somerset Maugham (1938) both said that mastering
English grammar is very important for monitoring the meaning and effectiveness
of the way we use the language. It can help foster precision, detect ambiguity,
and exploit the richness of expression available in English. Considering the
paramount importance of a good command of grammar, grammar tests are therefore
worth conducting. A basic knowledge of grammar underlies the ability to use
language to express meaning, and so grammar tests do have an important part to
play in language programs.

The purpose of conducting a grammar test is to assess an individual's
ability to recognise and correct ungrammatical sentences accurately, especially
in written English. In addition, it can be used to assist teachers in making
future plans and in improving instructional strategies and materials to achieve
the desired objectives. Grammar tests given to students frequently have low
validity and reliability. This analysis of teacher-made grammar test items was
therefore carried out in order to increase the validity and reliability of the
test items. Hopefully, it will be helpful for both teachers and students in
enhancing their understanding of English grammar.
2. Description of Grammar Test
2.1. Test Construction
The grammar test analysed in this paper was constructed in the form of a
multiple choice test. Its construction is based on the materials which have
been discussed in class activities. The teacher selected the materials which
are worth testing and which are highly likely to be tested in the final
examination. The selected test items are taken one hundred per cent from what
the students have been taught, and they are made as clear as possible. The test
is of the same standard as the exercises conducted during class activities. The
table of test specification below shows the areas of grammar tested for the
sake of this analysis, which are taken from the teacher’s guide book.

Table of Test Specification

No  Course of Materials          Objective                      Item No  Testing %  Total Items
1   Comparisons                  Based on Teacher’s guide book  1-3      10%        3
2   Adjectives & Adverbs         Based on Teacher’s guide book  4-6      10%        3
3   Quantifiers                  Based on Teacher’s guide book  7-9      10%        3
4   Relative Pronouns            Based on Teacher’s guide book  10-12    10%        3
5   Prepositions                 Based on Teacher’s guide book  13-17    17%        5
6   Infinitives & Gerunds        Based on Teacher’s guide book  18-21    13%        4
7   Causatives                   Based on Teacher’s guide book  22-24    10%        3
8   Phrasal Verbs                Based on Teacher’s guide book  25-27    10%        3
9   Conditionals & Subjunctives  Based on Teacher’s guide book  28-30    10%        3
    Total                                                                100%       30

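
The percentages in the table of test specification follow directly from the item counts. As a minimal sketch, assuming each percentage is simply the area's share of the 30 items rounded to a whole number (the function name `area_weights` is illustrative, not part of the original paper), the column can be reproduced as follows:

```python
# Item counts per grammar area, copied from the table of test specification.
ITEMS_PER_AREA = {
    "Comparisons": 3,
    "Adjectives & Adverbs": 3,
    "Quantifiers": 3,
    "Relative Pronouns": 3,
    "Prepositions": 5,
    "Infinitives & Gerunds": 4,
    "Causatives": 3,
    "Phrasal Verbs": 3,
    "Conditionals & Subjunctives": 3,
}


def area_weights(items_per_area):
    """Return each area's share of the test as a whole-number percentage."""
    total = sum(items_per_area.values())
    # Assumption: shares are rounded to the nearest whole per cent.
    return {area: round(100 * n / total) for area, n in items_per_area.items()}


weights = area_weights(ITEMS_PER_AREA)
for area, pct in weights.items():
    print(f"{area}: {pct}%")
```

With these counts the rounded shares happen to add up to exactly 100; with other counts, rounding could leave the total slightly above or below 100.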
In this multiple choice test, the students are presented with thirty test
items. Each item has five options, and the stems are mostly constructed as
simple sentences requiring one correct answer out of the five options. The
options include four distractors, which are made as similar as possible to the
intended answer; the students therefore have to answer carefully, otherwise
they will be trapped. The five options for each test item are typed vertically
from A to E, all in capital letters. In answering the test, the students are
instructed to choose the best answer by crossing or circling the letter A, B,
C, D, or E on their answer sheets. The answer sheets are provided separately
for the students’ convenience. This multiple choice grammar test is intended
solely for the students’ improvement and was constructed independently by the
tester, without any interference from a second or third party.
2.2. Test Administration
This grammar test was administered to third-semester students of the English
Faculty of Teacher Training and Education Program of Mahasaraswati University
Denpasar. The test was conducted in the evening on November 19th, 2009 at the
Mahasaraswati Campus at Soka.
Twenty students took the test, and all of them were seated as for the final
test, with each student occupying one table. The students were instructed to
answer the test on the separate answer sheet provided and to choose one correct
answer by crossing or circling the letter A, B, C, D, or E on the answer sheet.
Before the test began, the students were instructed to clear their tables and
close their books. Furthermore, no one was allowed to cheat or look at other
students’ work. The students had to answer the test honestly and individually,
based on their own knowledge.
3. Test Result Analysis
3.1. Validity

A test is considered valid when it measures what it is supposed to measure
and nothing else; it should not involve other skills at the same time, which
would confuse the students. In this multiple choice grammar test, the tester
has tried to make each item as valid as possible in order to obtain a good
measurement of the students’ ability and understanding. The validity of this
test can be assessed in four ways, but this analysis discusses only three of
them.
3.1.1. Face Validity
A test is considered to have good face validity if its items look right to
other testers, teachers, moderators, and testees (Heaton, 1989). Heaton argues
that a test with good face validity can maintain and heighten the students’
motivation, whereas a test with poor face validity can reduce it. It is
therefore necessary to show the test to other teachers or colleagues to obtain
their views on whether or not it is valid. This helps to avoid problems once
the test is examined by other people. In this case, the test items are reviewed
by the lecturer to ensure high face validity.
Each item in this test is typed in 14-point Times New Roman with 1.5 line
spacing, which helps the students read the test easily. The test items are
written clearly and briefly on plain A4 paper with margins of 2.54 cm on all
sides. By applying these criteria, the test is intended to have high face
validity and to motivate the students.
Face validity is regarded by most test designers as the most important of all
types of test validity. It provides not only a quick and reasonable guide but
also a balance to too great a concern with statistical analysis. Moreover,
students are motivated by a test with good face validity and will therefore try
harder to finish it.
3.1.2. Content Validity
Content validity requires a careful analysis of the language being tested and
of the particular course objectives. A test is considered to have content
validity if its items represent all the materials that have been taught or
discussed by the teacher. In this multiple choice grammar test, all the items
have been made through critical and careful selection of the materials
discussed previously. The content of the test is taken entirely from those
materials, with each grammatical area given a particular percentage. As set out
in the table of test specification above, phrasal verbs occupy 10 per cent,
conditionals and subjunctives 10 per cent, adjectives and adverbs 10 per cent,
prepositions 17 per cent, comparisons 10 per cent, relative pronouns 10 per
cent, gerunds and infinitives 13 per cent, quantifiers 10 per cent, and
causatives 10 per cent. In this way, the grammar test is intended to have good
content validity, with its items representing each of the grammatical areas
discussed earlier.
3.1.3. Construct Validity
A test has construct validity if it is capable of measuring certain specific
characteristics in accordance with a theory of language behavior and learning
(Heaton, 1989). Construct validity assumes the existence of certain learning
theories or constructs underlying the acquisition of abilities and skills. In
testing grammar, the structuralist approach is regarded here as the best way to
achieve high construct validity. In this analysis, grammar is tested by means
of multiple choice items because they can cover many more areas of grammar than
other types of test. If grammar were tested using a communicative or
integrative approach, the test would have less construct validity.
3.2. Reliability
A test is considered reliable if it is consistent in its scoring or
measurements. Reliability is a necessary characteristic of any good test
because the test is used as a measuring instrument. Bachman (1990) emphasises
that internal consistency is concerned with how consistent test takers’
performances on the different parts of the test are with each other. If the
test is administered to the same candidates on different occasions and produces
different results, the test is not reliable. Reliability is extremely important
in public achievement and proficiency tests as well as in classroom tests.
There are several methods of estimating the reliability of a test, such as the
split-half method, Foelich’s internal consistency formula, K-R20, and K-R21. In
this analysis, the test constructor uses only one method, K-R21, to find the
reliability coefficient of the test as a whole. Furthermore, Frisbie (1988)
highly recommends using K-R21 to interpret the reliability of teacher-made
tests.

3.2.1. K-R21 Formula
As Frisbie recommends, K-R21 is simple to use, since it avoids troublesome
correlations and, in addition to the number of items in the test, involves only
the test mean and standard deviation, both of which are normally calculated
anyway as a matter of routine. Before the K-R21 formula can be used, two steps
must be completed: finding the mean and finding the standard deviation.
Table 1 below is a detailed table of the testees’ achievement, which also shows
the score on each item.

A. Mean
In order to find the reliability coefficient of the test, we must first find
the mean of all testees’ scores, referring to the data in Table 1, using this
formula:

$M = \frac{\sum fx}{N}$

Where: M = the mean of the testees’ scores;
∑fx = the sum of all testees’ scores (each score x multiplied by its
frequency f);
N = the number of the testees.

$M = \frac{437}{20} = 21.85$
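
As a quick check of the mean, the sketch below sums the twenty scores listed in Table 2 and divides by the number of testees. The names `SCORES` and `mean_score` are illustrative only.

```python
# The twenty testees' scores, as listed in Table 2 (highest first).
SCORES = [29, 27, 26, 25, 25, 24, 24, 24, 23, 23,
          22, 22, 21, 21, 20, 18, 18, 16, 15, 14]


def mean_score(scores):
    """Arithmetic mean: the sum of all scores divided by the number of testees."""
    return sum(scores) / len(scores)


print(sum(SCORES), len(SCORES), mean_score(SCORES))
```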
B. Standard Deviation
After obtaining the mean score of all testees, we have to find the standard
deviation of the test. Table 2 below shows the spread of the testees’ scores.
Table 2. The Spread of Testees' Scores (m = 21.85)

NO  Testees  Score (x)  (x - m) = d  d²
1   S        29          7.15        51.12
2   P        27          5.15        26.52
3   O        26          4.15        17.22
4   K        25          3.15         9.92
5   T        25          3.15         9.92
6   B        24          2.15         4.62
7   G        24          2.15         4.62
8   J        24          2.15         4.62
9   H        23          1.15         1.32
10  R        23          1.15         1.32
11  I        22          0.15         0.02
12  M        22          0.15         0.02
13  A        21         -0.85         0.72
14  Q        21         -0.85         0.72
15  E        20         -1.85         3.42
16  C        18         -3.85        14.82
17  N        18         -3.85        14.82
18  F        16         -5.85        34.22
19  L        15         -6.85        46.92
20  D        14         -7.85        61.62
                         ∑d² =      308.55
Based on the data from Table 2 above, we will use the formula below to find the
standard deviation of the test.

$s.d = \sqrt{\frac{\sum d^2}{N}}$

Where: s.d = standard deviation;
∑d² = the sum of the squared deviations from the mean (d);
N = the number of all testees.

$s.d = \sqrt{\frac{308.55}{20}} = \sqrt{15.43} = 3.93$
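
The standard deviation can be verified in the same way. The sketch below follows the steps of Table 2: it computes each deviation d from the mean, squares it, sums the squares, divides by the number of testees N, and takes the square root. The function names are illustrative only. Because the divisor is N rather than N - 1, the result agrees with the population standard deviation, `statistics.pstdev`, in Python's standard library.

```python
import math
import statistics

# The twenty testees' scores, as listed in Table 2.
SCORES = [29, 27, 26, 25, 25, 24, 24, 24, 23, 23,
          22, 22, 21, 21, 20, 18, 18, 16, 15, 14]


def sum_squared_deviations(scores):
    """Sum of d squared, where d is each score minus the mean."""
    m = sum(scores) / len(scores)
    return sum((x - m) ** 2 for x in scores)


def standard_deviation(scores):
    """Square root of (sum of squared deviations / number of testees)."""
    return math.sqrt(sum_squared_deviations(scores) / len(scores))


print(round(sum_squared_deviations(SCORES), 2))  # sum of d squared
print(round(standard_deviation(SCORES), 2))      # standard deviation
```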
C. K-R21
All the data we need to obtain the reliability coefficient of the test have now
been obtained. Using the K-R21 formula below, we calculate the reliability
coefficient of the grammar test.

$K\text{-}R21 = \frac{N}{N-1}\left(1 - \frac{m(N-m)}{N x^2}\right)$

Where: N = the number of items in the test;
m = the mean score on the test for all the testees;
x = the standard deviation of all testees’ scores;
K-R21 = the reliability coefficient.

$K\text{-}R21 = \frac{30}{29}\left(1 - \frac{21.85\,(30-21.85)}{30 \times 3.93^2}\right)$
$= 1.03\left(1 - \frac{178.08}{463.35}\right)$
$= 1.03 \times 0.62$
$= 0.64$
According to the calculation above, the K-R21 computation on the grammar test
scores yields a reliability coefficient of 0.64, which shows that the grammar
test is reliable.
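
The K-R21 computation can be expressed as a small function of the three quantities it needs: the number of items, the mean, and the standard deviation. This is a minimal sketch; the function name `kr21` and its parameter names are illustrative only, and the inputs are the rounded values reported above.

```python
def kr21(n_items, mean, sd):
    """Kuder-Richardson formula 21.

    n_items: number of items in the test (N)
    mean:    mean score of all testees (m)
    sd:      standard deviation of all testees' scores (x)
    """
    correction = n_items / (n_items - 1)
    ratio = (mean * (n_items - mean)) / (n_items * sd ** 2)
    return correction * (1 - ratio)


# Values reported in this analysis: 30 items, mean 21.85, s.d 3.93.
reliability = kr21(30, 21.85, 3.93)
print(round(reliability, 2))
```

Rounded to two decimal places, the function returns the same coefficient as the hand calculation, 0.64.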
3.3. Item Analysis
The results of the objective grammar test have been obtained, but the work is
not yet finished. Further analysis is needed, since the results can provide
valuable information concerning the performance of the students as a group, of
individual students, and of each of the items comprising the test. The
information concerning the performance of the students as a whole and of
individual students is very important for teaching purposes, especially as many
results can show not only the types of errors most frequently made but also the
actual reasons for the errors being made. The performance of the test items is
also important in compiling future tests. All items which have been tested
should be examined from the point of view of their difficulty level (facility
value) and their discrimination level (discrimination value).

3.3.1. Facility Value (FV)
The facility value of an item shows how easy or difficult the particular
item is. It is generally expressed as the fraction or percentage of students
who answer the item correctly. The formula used in calculating the facility
value is:

$FV = \frac{R}{N}$

Where: FV = facility value
R = the number of correct answers
N = the number of students

The multiple choice grammar test given to the students consists of 30 items,
and the facility value of each item is shown in Table 3 below.

Baker (1989) says that the facility index is the proportion of correct
responses to total responses for an item. For item no 1, for instance (see
Table 3), 15 out of 20 students answered the item correctly, so the facility
value is 15/20 = 0.75, or roughly 0.8. This means that the item is fairly easy,
and its difficulty could be increased. The facility value is also useful when
arranging the test items in order of ascending difficulty. The item which has
the highest facility value should be put in the first position since it is the
easiest one, while the item which has the lowest facility value should be put
in the last position.
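
The facility value and the ordering rule described above can be sketched as follows. Only the figure for item 1 (15 correct answers out of 20) comes from this analysis; the other two entries are hypothetical values added purely to illustrate the ordering, and the function names are illustrative as well.

```python
def facility_value(correct, n_students):
    """FV = R / N: the proportion of students answering the item correctly."""
    return correct / n_students


def easiest_first(correct_by_item, n_students):
    """Return item labels ordered from highest to lowest facility value."""
    fv = {item: facility_value(c, n_students)
          for item, c in correct_by_item.items()}
    return sorted(fv, key=fv.get, reverse=True)


# Item 1 is taken from the text; the other two entries are hypothetical.
correct_counts = {
    "item 1": 15,
    "hypothetical item A": 9,
    "hypothetical item B": 18,
}

print(facility_value(15, 20))
print(easiest_first(correct_counts, 20))
```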
3.3.2. Discrimination Value (DV)
The discrimination value of an item indicates the extent to which the item
discriminates between the testees, separating the high-achieving students from
the low-achieving students. There are various methods of obtaining the
discrimination value of each item of the test; one of them is to compare the
top 27 per cent with the bottom 27 per cent of the testees, as shown by the
formula below. To do that, we first determine the number of students in the top
group and in the bottom group, based on the data in Table 4 below.
In assigning the upper and lower groups, we apply this formula:
27% x N        Where: N = the number of all testees
Using the formula above, we have:
A. Upper Group
27% x 20 = 5.4
Rounded to a whole number, there are 5 students in the upper group. Please
refer to Table 4 below to see the students who occupy the top group. They are
labelled in green.
B. Lower Group
27% x 20 = 5.4
There are likewise 5 students in the lower group. In Table 4 below, the members
of the lower group are labelled in orange. The remaining students form the
middle group.
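
The grouping step can be sketched as below, using the testee labels and scores listed in Table 4. The text does not say how 5.4 becomes 5, so rounding to the nearest whole number is an assumption here; the function names are illustrative only.

```python
# Testee labels and scores in the order given in Table 4 (highest score first).
RESULTS = [("S", 29), ("P", 27), ("O", 26), ("K", 25), ("T", 25),
           ("B", 24), ("G", 24), ("J", 24), ("H", 23), ("R", 23),
           ("I", 22), ("M", 22), ("A", 21), ("Q", 21), ("E", 20),
           ("C", 18), ("N", 18), ("F", 16), ("L", 15), ("D", 14)]


def group_size(n_testees, fraction=0.27):
    """Number of testees in the upper (or lower) group."""
    # Assumption: 27% of 20 = 5.4 is rounded to the nearest whole student.
    return round(n_testees * fraction)


def split_groups(results, fraction=0.27):
    """Rank testees by score and return (upper, middle, lower) label lists."""
    ranked = sorted(results, key=lambda pair: pair[1], reverse=True)
    k = group_size(len(ranked), fraction)
    upper = [name for name, _ in ranked[:k]]
    middle = [name for name, _ in ranked[k:-k]]
    lower = [name for name, _ in ranked[-k:]]
    return upper, middle, lower


upper, middle, lower = split_groups(RESULTS)
print(upper, lower)
```

With these scores there are no ties at either boundary (25 against 24 at the top, 20 against 18 at the bottom), so the split is unambiguous.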
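Table 4 (m = 21.85, N = 20)

NO  Testees  Score (x)  (x - m) = d  d²     Group
1   S        29          7.15        51.12  Upper (green)
2   P        27          5.15        26.52  Upper (green)
3   O        26          4.15        17.22  Upper (green)
4   K        25          3.15         9.92  Upper (green)
5   T        25          3.15         9.92  Upper (green)
6   B        24          2.15         4.62  Middle
7   G        24          2.15         4.62  Middle
8   J        24          2.15         4.62  Middle
9   H        23          1.15         1.32  Middle
10  R        23          1.15         1.32  Middle
11  I        22          0.15         0.02  Middle
12  M        22          0.15         0.02  Middle
13  A        21         -0.85         0.72  Middle
14  Q        21         -0.85         0.72  Middle
15  E        20         -1.85         3.42  Middle
16  C        18         -3.85        14.82  Lower (orange)
17  N        18         -3.85        14.82  Lower (orange)
18  F        16         -5.85        34.22  Lower (orange)
19  L        15         -6.85        46.92  Lower (orange)
20  D        14         -7.85        61.62  Lower (orange)
                         ∑d² =      308.55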
From the data above, we now find the discrimination value of each item using
this formula:

$DV = \frac{CU - CL}{N}$

Where: DV = discrimination value
CU = the number of correct answers in the upper group
CL = the number of correct answers in the lower group
N = the number of students in one group

See Table 5 for the discrimination value of each test item between the upper
and lower groups.
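
The discrimination value can be sketched as a one-line function. The counts used below are hypothetical, since the per-item figures of Table 5 are not reproduced here; the second call illustrates the case, reported later for items 18 and 28, in which both groups have the same number of correct answers. The function name and parameter names are illustrative only.

```python
def discrimination_value(correct_upper, correct_lower, group_size):
    """DV = (CU - CL) / N, where N is the number of students in one group."""
    return (correct_upper - correct_lower) / group_size


# Hypothetical item: all 5 upper-group students and 2 lower-group students
# answer correctly.
print(discrimination_value(5, 2, 5))

# Hypothetical item answered correctly by the same number of students in
# both groups: the item does not discriminate at all.
print(discrimination_value(4, 4, 5))
```

A value near 1 means the item separates the two groups well, a value of 0 means it does not separate them, and a negative value means the lower group outperformed the upper group on that item.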

3.4. Point of Revisions
The data in the table of discrimination values show that some items cannot
discriminate between high-achieving and low-achieving students; those items
therefore need revising. The analysis has revealed that 2 items out of 30 are
not acceptable for the next use of the test because of their lack of
discriminating power. Those test items are items no 18 and 28. They need
revising, whether in the stem or in the options. After they have been revised,
they have to be retested with the students and reanalysed in order to find out
whether or not the revision has improved the quality of the items.

Those two non-discriminating test items can be revised by simplifying or
otherwise changing their stems or distractors. Items no 18 and 28 seemed rather
easy for both the high-achieving and the low-achieving students, since both
groups have the same percentage of correct answers. This means that the two
items cannot discriminate between the high-achieving and the low-achieving
students; they therefore need to be revised.
4. Conclusion & Suggestions
4.1. Conclusion
The critical analysis of the teacher-made multiple choice grammar test has
been carried out, and the following points can be noted:
Firstly, the grammar test, whose items are taken entirely from the teacher’s
guide book, consists of 30 items. 28 items have been shown to have good
discriminating power, since they were able to discriminate between
high-achieving and low-achieving students. Those 28 items can therefore be used
again in future. The other 2 items, which do not have good discriminating
power, should be revised. After revision, the test should be tried out again to
measure its validity and reliability.
Secondly, the computation of the reliability coefficient by means of K-R21
resulted in a reliability coefficient of 0.64. This means that the grammar test
is reliable for the third-semester students of the English Faculty of Teacher
Training and Education Program of Mahasaraswati University, and thus the scores
are dependable.

Thirdly, the grammar test has high content validity, since all the materials
are taken from what has been taught previously by the tester. In addition, it
is well organised and neatly typed, which gives it good face validity, and its
structuralist multiple choice format supports its construct validity.
4.2. Suggestions
This critical analysis of teacher-made multiple choice grammar test items is
valuable, especially for teachers; the following suggestions are therefore
offered.
Firstly, lecturers of English grammar at Mahasaraswati University are
recommended to make grammar test items as valid and reliable as possible,
because the results of a valid and reliable measuring instrument are much more
dependable. For classroom use, the test should be valid at least in terms of
content validity, so that the testees will find it much more valuable because
it tests what is supposed to be tested. Besides, the reliability of the test
should also be maintained, because it is a necessary characteristic of any good
test.
Secondly, English grammar lecturers should increase their practical knowledge
and ability concerning language testing, so that the grammar test can be used
not only within the classroom but also in the wider scope of language testing.

REFERENCES
British Council/University of Cambridge Local Examinations Syndicate. English Language Testing Service.
Heaton, J.B. 1982. Language Testing. Modern English Publications.
Anderson, J. 1971. A technique for measuring reading comprehension and readability. English Language Teaching Journal.
Baker, D. 1989. Language Testing: A Critical Survey and Practical Guide. London: Edward Arnold.
Ur, P. 1996. A Course in Language Teaching. Cambridge University Press.
Johnson, K. 1995. Language Teaching and Skill Learning. Oxford: Basil Blackwell.
Hughes, A. 1989. Testing for Language Teachers. Cambridge University Press.
Green, A. J. 1975. Teacher Made Test. New York.
APPENDICES
