
1. Introduction

Teachers all over the world develop test items and administer them to their
learners. Seema Varna describes item analysis as a method for reviewing the
items in a test, both qualitatively and quantitatively. This report uses a
quantitative (statistical) approach to examine the reliability and validity of
each test item. The responses of each student to each test item are shown in
figures and tables, which are then interpreted to determine whether each item
met the minimum quality control criteria.

According to Seema Varna, most of these test items are diagnostic tools and
are not meant to measure growth. Because of their diagnostic nature, most
teacher-developed test items tend to be of lower quality, and their
reliability and validity are seldom tested.

The test items administered to learners in the classroom enable teachers to
identify learners with problems; they also inform classroom instruction, the
curriculum and teacher development. For test items to serve these purposes,
the teacher should prepare them thoroughly before they are administered to
the learners.

Part of the thorough planning that goes into developing test items is to
check each item for quality. This enables the teacher to produce items of the
highest quality and ensures that reliable test results are obtained.

2. Purpose of the report


The purpose of this report is to present the descriptive statistics
calculated for a test of 20 multiple-choice questions that was administered
to 25 students.

3. Test analysis

3.1 Descriptive statistics


Descriptive statistics summarise a set of scores: their central tendency and
how widely they are dispersed. In short, the term refers here to the mean,
the mode, the median and the standard deviation. In this report these four
statistics were calculated from a set of test scores, and the results are
shown in Table 1. The mean is the average of a set of scores. The median is
the midpoint, or middle value, of a distribution. The mode is the score that
occurs most frequently in a set of scores.

Refer to Table 1 to see calculated descriptive statistics.

Table 1: Descriptive statistics

Statistic                      Value
Mean                           65.79
Mode                           65.00
Median                         65.00
Variance (STDEV squared)      479.57
Standard deviation (STDEV)     21.90
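As a rough illustration of how these statistics can be computed, the sketch below uses Python's standard `statistics` module. The scores are hypothetical and are not the report's data, and the use of the sample (n - 1) formulas is an assumption, since the report does not say which formula was used.

```python
import statistics

# Hypothetical percentage scores, NOT the scores analysed in this report.
scores = [45, 55, 65, 65, 65, 75, 85]

mean = statistics.mean(scores)          # average of the scores
median = statistics.median(scores)      # middle value of the sorted scores
mode = statistics.mode(scores)          # most frequent score
variance = statistics.variance(scores)  # sample variance (divides by n - 1)
stdev = statistics.stdev(scores)        # square root of the sample variance

print(f"mean={mean} median={median} mode={mode}")
print(f"variance={variance:.2f} stdev={stdev:.2f}")
```

The variance is simply the standard deviation squared, which is the relationship between the last two rows of Table 1.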

The scores can be interpreted as approximately normally distributed, because
the three measures of central tendency (mean 65.79, mode 65.00 and median
65.00) are almost the same. In a bell-shaped curve, or normal distribution,
about 68% of the test scores fall within one standard deviation of the mean,
about 95% fall within two standard deviations of the mean and about 99.7%
fall within three standard deviations of the mean.
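To make these bands concrete, the following sketch applies the rule to the mean and standard deviation reported in Table 1. The function name `band` is illustrative only.

```python
MEAN = 65.79    # mean from Table 1
STDEV = 21.90   # standard deviation from Table 1

def band(k):
    """Return the (low, high) score range within k standard deviations of the mean."""
    return (round(MEAN - k * STDEV, 2), round(MEAN + k * STDEV, 2))

bands = {k: band(k) for k in (1, 2, 3)}
for k, (low, high) in bands.items():
    print(f"within {k} SD of the mean: {low} to {high}")
```

The two- and three-standard-deviation bands extend beyond 100, the maximum possible score, so the normal curve is best read as an approximation of these scores.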

Refer to Figure 1 to see the normal distribution curve

Figure 1: Normal distribution curve

3.2 Frequency graphs

3.2.1 Grouped frequency table


The calculations for the grouped frequency table are used to draw the graphs
that follow. The highest score obtained in the test is 100 and the lowest
score is 15. The range, obtained by subtracting the lowest score from the
highest score, is 85. The number of intervals can be decided by the
individual teacher or researcher; here 10 intervals were chosen. The size of
each interval is obtained by dividing the range by the number of intervals,
which gives 8.5.

Refer to Table 2 to see Grouped frequency table

Table 2: Grouped frequency table

Highest score (H)        100
Lowest score (L)          15
Range (H - L)             85
Number of intervals       10
Size of intervals        8.5
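The arithmetic behind Table 2 can be checked with a few lines of Python. The variable names are illustrative only.

```python
highest, lowest = 100, 15   # highest and lowest scores from Table 2
num_intervals = 10          # chosen by the teacher or researcher

score_range = highest - lowest                 # H - L
interval_size = score_range / num_intervals    # range divided by number of intervals

print(score_range, interval_size)
```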

3.2.2 Cumulative frequency distribution


The cumulative frequency of an interval is the running total of the
frequencies up to and including that interval. Start with the frequency of
the first interval, add the frequency of the second interval to it, and
continue in this way until the last interval is reached. The final cumulative
frequency equals the total number of students (25).

Refer to Table 3 to see the Cumulative frequency distribution

Table 3: Cumulative frequency distribution

Lower limit  Upper limit  Interval   Middle value  Frequency  Cumulative frequency
15           24           15 - 24    19.5          1          1
25           34           25 - 34    29.5          2          3
35           44           35 - 44    39.5          0          3
45           54           45 - 54    49.5          4          7
55           64           55 - 64    59.5          3          10
65           74           65 - 74    69.5          6          16
75           84           75 - 84    79.5          1          17
85           94           85 - 94    89.5          6          23
95           104          95 - 104   99.5          2          25
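The running total in the last column of Table 3 can be reproduced from the frequency column. A minimal sketch using `itertools.accumulate` from the Python standard library:

```python
from itertools import accumulate

# Frequencies of the nine intervals in Table 3, from 15-24 up to 95-104.
frequencies = [1, 2, 0, 4, 3, 6, 1, 6, 2]

# Each entry is the sum of all frequencies up to and including that interval.
cumulative = list(accumulate(frequencies))

print(cumulative)
```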

3.2.3 Frequency histogram


A histogram is a way of summarising data measured on a discrete or continuous
interval scale. It is mainly used to show the main features of the
distribution of the data in a convenient way. A histogram divides the range
of possible values in a data set into groups, or classes. For each class a
rectangle is drawn whose base equals the range of values in that class and
whose height represents the number of scores in that class, so the rectangles
may differ in height.

Refer to Figure 2 to see the frequency histogram

Figure 2: Frequency histogram (x-axis: Interval, from 15-24 to 95-104; y-axis: Frequency, from 0 to 7)

3.2.4 Frequency polygon


The middle value of each interval was plotted against its frequency, and the
plotted points were joined with straight lines. The points obtained were
(19.5, 1), (29.5, 2), (39.5, 0), (49.5, 4), (59.5, 3), (69.5, 6), (79.5, 1),
(89.5, 6) and (99.5, 2).

Refer to Figure 3 to see the Frequency Polygon

Figure 3: Frequency polygon (x-axis: Middle values, from 9.5 to 99.5; y-axis: Frequency, from 0 to 7)
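The points of the frequency polygon can be derived from the interval limits and frequencies in Table 3. In this sketch the middle value is taken as the average of the lower and upper limit of each interval, which matches the tabulated values.

```python
# Interval limits from Table 3: 15-24, 25-34, ..., 95-104.
intervals = [(lower, lower + 9) for lower in range(15, 96, 10)]
frequencies = [1, 2, 0, 4, 3, 6, 1, 6, 2]

# Middle value of each interval: the average of its two limits.
midpoints = [(lower + upper) / 2 for lower, upper in intervals]

# Each point of the polygon pairs a middle value with its frequency.
points = list(zip(midpoints, frequencies))

print(points)
```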

3.2.5 Cumulative frequency graph (An ogive)


An ogive is a cumulative frequency polygon and is sometimes presented in
percentage form. It is plotted with the upper values of the intervals on the
X axis and the cumulative frequency on the Y axis. Its main use is to
estimate percentiles. The most important percentiles read from the ogive are
the median (50%), the lower quartile (25%) and the upper quartile (75%).

Refer to Figure 4 to see the Cumulative Frequency Graph

Figure 4: Cumulative frequency graph (an ogive) (x-axis: Upper values, from 14 to 104; y-axis: Cumulative frequency, from 0 to 30)
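Reading a percentile off the ogive amounts to interpolating between two plotted points. The sketch below does this with straight-line interpolation, using the upper values on the X axis of Figure 4 and the cumulative frequencies of Table 3. The function name `percentile_from_ogive` and the starting point (14, 0) are assumptions made for illustration.

```python
def percentile_from_ogive(boundaries, cumulative, fraction):
    """Estimate the score below which `fraction` of the students fall,
    by linear interpolation between neighbouring ogive points."""
    target = fraction * cumulative[-1]
    for i in range(1, len(boundaries)):
        low_c, high_c = cumulative[i - 1], cumulative[i]
        # Skip flat segments (empty intervals) to avoid dividing by zero.
        if high_c > low_c and low_c <= target <= high_c:
            low_b, high_b = boundaries[i - 1], boundaries[i]
            return low_b + (target - low_c) / (high_c - low_c) * (high_b - low_b)
    raise ValueError("fraction is outside the range of the ogive")

# Upper values from Figure 4; the ogive is assumed to start at (14, 0).
boundaries = [14, 24, 34, 44, 54, 64, 74, 84, 94, 104]
cumulative = [0, 1, 3, 3, 7, 10, 16, 17, 23, 25]

lower_quartile = percentile_from_ogive(boundaries, cumulative, 0.25)
median = percentile_from_ogive(boundaries, cumulative, 0.50)
upper_quartile = percentile_from_ogive(boundaries, cumulative, 0.75)

print(round(lower_quartile, 1), round(median, 1), round(upper_quartile, 1))
```

These are estimates from grouped data, so they need not coincide with the median calculated from the individual scores.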

3.3 Reliability coefficients


Reliability is the extent to which a test consistently yields the same
scores, that is, the extent to which the scores are free from random errors
of measurement.

A reliability coefficient of 1.00 corresponds to a standard error of
measurement of zero, which means that the test is perfectly reliable.

Refer to Table 4 to see the reliability coefficients

Table 4: Reliability coefficients

Number of items (K)        20
K - 1                      19
Total pq                 3.83
STDEV                   21.90
(STDEV) squared        479.57
KR20                     1.04

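The KR20 reliability coefficient calculated for the 20 multiple-choice test
items is 1.04. A reliability coefficient cannot exceed 1.00, so this value
cannot be read as perfect reliability. It suggests that the variance (479.57,
calculated from percentage scores) and the total pq (3.83, calculated per
item) were not expressed on the same scale. The coefficient should be
recalculated with the variance of the raw scores before concluding that the
test would yield the same results if administered to other students.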
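The value in Table 4 can be traced with the usual KR20 formula, KR20 = (K / (K - 1)) * (1 - total pq / variance). The sketch below first reproduces the tabulated figure and then, under the assumption that each of the 20 items is worth 5 percentage points, rescales the variance to raw scores out of 20. That rescaling is an assumption for illustration and is not a calculation taken from the report.

```python
def kr20(k, sum_pq, variance):
    """Kuder-Richardson formula 20 for k dichotomously scored items."""
    return (k / (k - 1)) * (1 - sum_pq / variance)

K = 20            # number of items (Table 4)
SUM_PQ = 3.83     # total pq (Table 4)
VARIANCE = 479.57 # variance of the percentage scores (Table 4)

# Reproduces the tabulated value.
reported = kr20(K, SUM_PQ, VARIANCE)

# Assumption: percentage score = raw score * 5, so the raw-score variance
# is the percentage-score variance divided by 5 squared.
raw_variance = VARIANCE / 5 ** 2
rescaled = kr20(K, SUM_PQ, raw_variance)

print(round(reported, 2), round(rescaled, 2))
```

Under that assumption the coefficient falls below 1.00, which is the range in which a reliability coefficient can be interpreted.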

4. Item analysis
Item analysis means evaluating the quality of test items by examining the
students' responses to each item. The process relies mainly on the difficulty
and discrimination indices.

4.1 Difficulty index


The difficulty index (p) is the proportion of students who answered a test
item correctly: the number of correct answers divided by the number of
students who answered the item. It indicates how difficult each test item is
and thus says something about its quality. If all students answered an item
correctly, the item may have been too easy; if no student answered an item
correctly, the item may have been too difficult.

Refer to Table 5 to see the Difficulty index

Table 5: Difficulty index (p)

Question      # Correct  # Answered  p
Question 1    21         25          0.84
Question 2    22         25          0.88
Question 3    17         25          0.68
Question 4    12         25          0.48
Question 5    21         25          0.84
Question 6    17         25          0.68
Question 7    11         25          0.44
Question 8    12         23          0.52
Question 9    13         25          0.52
Question 10   8          24          0.33
Question 11   23         25          0.92
Question 12   19         25          0.76
Question 13   15         25          0.60
Question 14   21         25          0.84
Question 15   20         25          0.80
Question 16   22         24          0.92
Question 17   15         24          0.63
Question 18   8          24          0.33
Question 19   13         25          0.52
Question 20   16         25          0.64
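The p column is the number of correct answers divided by the number of students who answered the item. A minimal sketch for three rows of Table 5 (the function name is illustrative):

```python
def difficulty_index(correct, answered):
    """Proportion of the students answering an item who got it right."""
    return correct / answered

# (# Correct, # Answered) for three items from Table 5.
items = {"Question 1": (21, 25), "Question 8": (12, 23), "Question 10": (8, 24)}

p_values = {q: round(difficulty_index(c, a), 2) for q, (c, a) in items.items()}

print(p_values)
```

Note that Questions 8 and 10 were not answered by all 25 students, so the divisor is 23 and 24 respectively.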

4.2 Interpretation of the difficulty level of questions
The test items are now analysed individually to determine the difficulty
level of each. Of the 20 questions, questions 1, 2, 5, 11, 12, 14 and 16 were
unacceptable because they were too easy. Questions 3, 4, 6, 7, 8, 9, 10, 13,
15, 17, 18, 19 and 20 were acceptable, meaning that they were neither too
easy nor too difficult. In percentage terms, 35% of the 20 questions were
unacceptable and 65% were acceptable. The cut-off value used for this
classification is not stated; Question 15 (p = 0.80) is classed as acceptable
although Question 12 (p = 0.76) is classed as too easy, so the classification
of Question 15 should be rechecked.

Refer to Table 6 to see the Interpretation of the difficulty level of questions

Table 6: Interpretation of the difficulty level of questions

Question      Proportion (p)  Interpretation  Reason
Question 1    0.84            Unacceptable    Too easy
Question 2    0.88            Unacceptable    Too easy
Question 3    0.68            Acceptable      Fine
Question 4    0.48            Acceptable      Fine
Question 5    0.84            Unacceptable    Too easy
Question 6    0.68            Acceptable      Fine
Question 7    0.44            Acceptable      Fine
Question 8    0.52            Acceptable      Fine
Question 9    0.52            Acceptable      Fine
Question 10   0.33            Acceptable      Fine
Question 11   0.92            Unacceptable    Too easy
Question 12   0.76            Unacceptable    Too easy
Question 13   0.60            Acceptable      Fine
Question 14   0.84            Unacceptable    Too easy
Question 15   0.80            Acceptable      Fine
Question 16   0.92            Unacceptable    Too easy
Question 17   0.63            Acceptable      Fine
Question 18   0.33            Acceptable      Fine
Question 19   0.52            Acceptable      Fine
Question 20   0.64            Acceptable      Fine
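A classification of this kind can be expressed as a simple rule. The report does not state its cut-off values, so the thresholds below (too easy above 0.75, too difficult below 0.25) are hypothetical and chosen only for illustration.

```python
TOO_EASY_ABOVE = 0.75   # hypothetical cut-off, not stated in the report
TOO_HARD_BELOW = 0.25   # hypothetical cut-off, not stated in the report

def classify(p):
    """Return (interpretation, reason) for a difficulty index p."""
    if p > TOO_EASY_ABOVE:
        return ("Unacceptable", "Too easy")
    if p < TOO_HARD_BELOW:
        return ("Unacceptable", "Too difficult")
    return ("Acceptable", "Fine")

# Difficulty indices of Questions 1 to 20 from Table 6.
p_values = [0.84, 0.88, 0.68, 0.48, 0.84, 0.68, 0.44, 0.52, 0.52, 0.33,
            0.92, 0.76, 0.60, 0.84, 0.80, 0.92, 0.63, 0.33, 0.52, 0.64]

too_easy = [q for q, p in enumerate(p_values, start=1)
            if classify(p)[1] == "Too easy"]

print(too_easy)
```

With these assumed cut-offs Question 15 (p = 0.80) is also flagged as too easy, which differs from Table 6; a different cut-off would give a different split.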

4.3 Discrimination index (D)


The discrimination index is the extent to which a test item differentiates
between high-scoring and low-scoring students. It ranges from -1.00 to +1.00.
Items with a negative discrimination index usually need to be rewritten by
the teacher, so that the reliability and validity of the test items can be
ensured.

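Refer to Table 7 to see the Discrimination index

Table 7: Discrimination index (D)

#U is the number of students in the upper group who answered the item
correctly, and #L is the number in the lower group who did so.

Question      #U   #L   D
Question 1    15   6    0.60
Question 2    15   7    0.53
Question 3    14   3    0.79
Question 4    8    4    0.50
Question 5    15   6    0.60
Question 6    12   5    0.58
Question 7    9    2    0.78
Question 8    10   2    0.80
Question 9    10   3    0.70
Question 10   8    0    1.00
Question 11   14   9    0.36
Question 12   14   5    0.64
Question 13   12   3    0.75
Question 14   15   6    0.60
Question 15   14   6    0.57
Question 16   15   7    0.53
Question 17   12   3    0.75
Question 18   5    3    0.40
Question 19   12   1    0.92
Question 20   11   5    0.55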

The discrimination indices of all the questions were positive. This means
that, for every item, the high-scoring students were better able than the
low-scoring students to choose the key rather than the distractors in the
multiple-choice test items.
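The D values in the table are consistent with dividing the difference between the upper-group and lower-group correct counts by the upper-group count, that is, D = (#U - #L) / #U. The sketch below reproduces three rows on that basis. A common textbook definition divides instead by the number of students in each group, so the formula here is an inference from the tabulated numbers rather than a stated method.

```python
def discrimination_as_tabulated(upper_correct, lower_correct):
    """D = (#U - #L) / #U, the form that matches the tabulated values.

    Assumes at least one upper-group student answered correctly.
    """
    return (upper_correct - lower_correct) / upper_correct

d_q1 = discrimination_as_tabulated(15, 6)    # first row of the table
d_q3 = discrimination_as_tabulated(14, 3)    # third row
d_q10 = discrimination_as_tabulated(8, 0)    # tenth row

print(round(d_q1, 2), round(d_q3, 2), round(d_q10, 2))
```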

4.4 Number of students in upper and lower group


Dividing the students into an upper and a lower group makes it possible to
measure each test item's ability to differentiate between the students who
are more likely to answer the item correctly and those who are more likely to
answer it incorrectly.

Refer to Table 8 to see the number of students in the upper and lower group

Table 8: Number of students in upper and lower group

Group   Number of students
Upper   15
Lower   10

The upper group contains 15 of the 25 students. This means that 60% of the
students were able to discriminate the correct answer (the key) from the
distractors, and 40% were not.

5. Conclusion
The importance of the reliability and validity of test items cannot be
stressed enough. A teacher has to plan each test item meticulously to ensure
that a minimum standard of quality is maintained. Every test item should be
of high quality; even if only one item out of twenty falls short, that item
should be developed further until the highest quality is achieved. Test items
should always test what they are intended to test, and they should yield the
same results consistently. Teachers should combine standardised testing, in
this case multiple-choice test items, with testing techniques that draw on
higher-order cognitive skills, such as performance-based and criterion-based
assessments. These can take the form of essays, open-ended problems,
interviews and oral presentations. For such test items to be assessed
adequately, the teacher should use predetermined evaluation criteria to
ensure that the highest level of reliability is maintained.

6. References
1. Glenwood High School [Image]. Retrieved October 10, 2007 from
http://www.glenwoodhighschool.co.za/images/photos/exams2003.jpg

2. Glossary of Measurement Term. Retrieved October 5, 2007 from
http://harcourtassessment.com

3. Kubiszyn, T. & Borich, G. (2007). Educational Testing and Measurement:
Classroom Application and Practice. 8th Edition. United States of America:
John Wiley & Sons, Inc.

4. Easton, V. J. & McColl, J. H. (n.d.). Statistical Glossary. Retrieved
October 8, 2007 from http://www.stats.gla.ac.uk/steps/glossary/index.html

5. Varna, S. (2007). Retrieved October 10, 2007 from
http://www.descriptive.statistics.gla.ac.uk/homenet.html

6. Hunt, N. (2002). Ogive. Retrieved October 10, 2007 from
http://home.ched.coventry.ac.uk/Volume/vol0/ogive.htm

7. Appendices

7.1 Appendix A
Question      # Correct  # Incorrect  Prop. correct (p)  Prop. incorrect (q)  pq
Question 1    21         4            0.84               0.16                 0.13
Question 2    22         3            0.88               0.12                 0.12
Question 3    17         8            0.68               0.32                 0.22
Question 4    12         13           0.48               0.52                 0.25
Question 5    21         4            0.84               0.16                 0.13
Question 6    17         8            0.68               0.32                 0.22
Question 7    11         14           0.44               0.56                 0.25
Question 8    12         11           0.52               0.48                 0.25
Question 9    13         12           0.52               0.48                 0.25
Question 10   8          16           0.33               0.67                 0.22
Question 11   23         2            0.92               0.08                 0.07
Question 12   19         6            0.76               0.24                 0.18
Question 13   15         10           0.60               0.40                 0.24
Question 14   21         4            0.84               0.16                 0.13
Question 15   20         5            0.80               0.20                 0.16
Question 16   22         2            0.92               0.08                 0.08
Question 17   15         9            0.63               0.38                 0.23
Question 18   8          16           0.33               0.67                 0.22
Question 19   13         12           0.52               0.48                 0.25
Question 20   16         9            0.64               0.36                 0.23
Total                                                                         3.83
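Each pq value is the product of the proportion of correct answers (p) and the proportion of incorrect answers (q) for an item, and the total is what enters the KR20 calculation. A minimal sketch for two rows of Appendix A (the function name is illustrative):

```python
def item_pq(correct, incorrect):
    """Return p * q for one item, where p and q are the proportions of
    correct and incorrect answers among the students who answered."""
    answered = correct + incorrect
    p = correct / answered
    q = incorrect / answered
    return p * q

pq_q4 = item_pq(12, 13)    # Question 4
pq_q11 = item_pq(23, 2)    # Question 11

print(round(pq_q4, 2), round(pq_q11, 2))
```

The product is largest (0.25) when half the students answer correctly, which is why items of moderate difficulty contribute most to the total.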

7.2 Appendix B: Spreadsheet of the test items answered by 25 students

Key C B D D B C D A C B A C B D A A C D B C
St No Q1 Q2 Q3 Q4 Q5 Q6 Q7 Q8 Q9 Q10 Q11 Q12 Q13 Q14 Q15 Q16 Q17 Q18 Q19 Q20
1 C B B A C D A D D A D A A A A C B D B
2 C B D D B D A A C B A C B D A A C D B C
3 C B D D B C D A C B A C B D A A C B D C
4 C B D B B C B A C B A C A D C A C B C C
5 C B D C B C B A C D A C B D A A A B B C
6 C A D D C C A D C D A C A D A A A B D C
7 B B A B B C B B D D A C B D C A A D D C
8 C B D B B C B D B C A C B D A A C A B A
9 C B D A B C D D B D A C B D A A C B D A
10 C B B A B C D C D C A B A D D A C D B C
11 C B D D B C D A C B A C B D A A C D B C
12 C B D D B C D D D A A C A D A A C B B D
13 C B D A B C D A C B A C B D A A A B B C
14 C B D A B C D A C B A C B D A A A B C
15 C B D D B B A A B D A C D A A C B B D D
16 C B D D B C D A C B A C B D A A C D B C
17 B B C C B A D D C A D B D A C A D
18 C B B D B A D D D D A C A D A A C B B C
19 D C A D B A B A D C C D A A D B B B B
20 C B D D B C D A C A C D B D A A C D B C
21 C A D D C C A D C D A C A D A A A B D C
22 B B A B B C B B D D A C B D C A A D D C
23 C B D B B C B D B C A C B D A A C A B A
24 C B B A C D A D D A D A A A A C B D B
25 C B D D B D A A C B A C B D A A C D B C
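A student's score follows from comparing each answer with the key. The sketch below scores two complete rows of the spreadsheet (students 3 and 11); the function name `score` is illustrative. Rows with fewer than 20 entries are not scored here, because the positions of the unanswered questions cannot be recovered from the listing.

```python
# Answer key for Q1 to Q20, from the first row of the spreadsheet.
KEY = "C B D D B C D A C B A C B D A A C D B C".split()

def score(answers):
    """Return (number correct, percentage) for a full row of 20 answers."""
    if len(answers) != len(KEY):
        raise ValueError("expected one answer per question")
    correct = sum(a == k for a, k in zip(answers, KEY))
    return correct, 100 * correct / len(KEY)

student_3 = "C B D D B C D A C B A C B D A A C B D C".split()
student_11 = "C B D D B C D A C B A C B D A A C D B C".split()

print(score(student_3), score(student_11))
```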
