RELIABILITY VS VALIDITY
Validity: whether the test really measures what it is supposed to measure. Reliability: whether the test result is consistent across time and circumstances.
VALIDITY
Confirms whether the test really measures what it is supposed to measure. Test validity is also the extent to which the conclusions and decisions made on the basis of the test scores are appropriate and meaningful.
Types Of Validity:
Content Validity
Ensures that an examination covers the scope of the study being tested. To achieve it, the content must be decided first, and balanced choices must be made so that the topics tested reflect the overall knowledge that was learnt.
Forecast (Predictive) Validity
The accuracy with which a test predicts the real ability of a pupil to carry out a certain task in a different situation in the future (e.g. a trial exam). To achieve it, the items should really test the aspects relating to the traits, abilities, skills or qualifications that are needed.
Concurrent Validity
Individuals' achievement on a test matches their achievement on another valid test (e.g. the test is repeated and yields the same score). To achieve it, build items that measure the same abilities as the other valid test.
Construct Validity
A test that is used to measure a trait, behaviour or mental state, such as:
brain ability, creativity, motivation, tolerance
Not easy to evaluate (e.g. a test that can differentiate between pupils' creative abilities).
RELIABILITY
Repeatability/consistency. Refers to the accuracy of test scores when testing is repeated. If the test yields the same result when measuring an individual or group on two different occasions, then the test is reliable.
Eg 1: if different teachers mark the same essay using the same criteria and marking scheme and obtain the same score, then the scores are reliable from one marker to another. Eg 2: if a candidate gets 90 marks in a test, he should get about the same marks if the test is repeated after a few days.
Reliability is reduced when: test scores are based on too few items (the test is short); the range of scores is too limited because the group tested is homogeneous; testing conditions are inadequate; scoring is subjective.
Reliability ensures the accuracy, uniformity and consistency of the evaluation instruments produced by a teacher every year.
Test-Retest
Essentially a measure of examinee reliability: an indication of how consistently examinees perform on the same set of tasks. The easiest method is to give the same test twice. This yields two scores for each individual tested, and the correlation between the first set of scores and the second set of scores is determined.
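The correlation step described above can be sketched in Python. The Pearson formula is standard; the two score lists are invented purely for illustration:

```python
# Sketch of a test-retest reliability estimate: the reliability
# coefficient is the Pearson correlation between scores from the
# first and second administrations of the same test.
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

first_admin  = [78, 85, 62, 90, 71]   # invented scores, first sitting
second_admin = [80, 83, 65, 88, 70]   # same pupils, second sitting

# A coefficient near 1 suggests the test ranks pupils consistently.
print(round(pearson_r(first_admin, second_admin), 3))
```

A coefficient close to 1 would indicate high test-retest reliability for this group; values nearer 0 would indicate inconsistent performance.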
Reliability is necessary for validity, and its estimation procedures are easier to follow than those for validity. Although reliability is necessary in order to have a valid measure, it does not guarantee that a reliable measure will be valid. That means reliability is not a sufficient condition for validity.
TYPES OF EVALUATION
TEACHER OBSERVATION
1) Checklist
2) Rating Scale
PUPIL REPORT
1) Self-Report
2) Peer-Report
PUPIL RESPONSE
1) Objective
2) Essay
OBJECTIVE TESTS
Objective tests are highly structured and require students to select the correct answer from a limited number of alternatives.
Chatterji (2003):
1) Each item must be designed to measure only the selected outcome from the domain
2) Items must elicit the desired performances in as direct a manner as possible
3) Each item should deal with a single question or concept, with one clearly correct or best answer
4) Items should not provide clues (e.g. grammatical clues) to the respondents
The purpose of the distractors is to appear as plausible solutions to the problem for those students who have not achieved the objective being measured by the test item. For students who have achieved the objective, the distractors must appear implausible; only the answer should appear plausible to these students.
In terms of versatility, multiple-choice items are appropriate for use in many different subject-matter areas and can be used to measure a variety of educational objectives. They are also adaptable to various levels of learning outcomes, from simple recall of knowledge to more complex levels such as analysis, synthesis and evaluation.
Multiple-choice items enable the test designer to improve an item by replacing distractors that are not functioning properly. The distractors chosen may be used to diagnose students' misconceptions or weaknesses in the teacher's instruction.
Multiple-choice items can easily be justified in terms of validity. Well-written multiple-choice items are more reliable than other test item formats, and less susceptible to guessing than true/false items. Their scoring is more clear-cut than short-answer scoring because there are no misspelled or partial answers to deal with.
Because they are objectively scored, they are not affected by scorer inconsistencies as essay questions are, and they offer a high degree of efficiency.
Guidelines for writing multiple-choice items:
1) Construct each item to assess a single written objective
2) Base each item on a specific problem stated clearly in the stem
3) Include as much of the item as possible in the stem, but do not include irrelevant information
4) State the stem in positive form
5) Keep the alternatives free from clues
6) Avoid the alternatives "all of the above" and "none of the above"
7) Present the answer in each of the alternative positions approximately an equal number of times, in a random order
8) Lay out the item in a clear and consistent manner
9) Analyze the effectiveness of each item after each administration of the test
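The guideline about presenting the answer in each alternative position an approximately equal number of times can be sketched by shuffling the alternatives for every item. The helper name `lay_out_item` and the sample item below are invented for illustration:

```python
# Sketch: randomise the position of the correct answer among the
# alternatives so that, over many items, the key appears in each
# position roughly equally often. The item text is made up.
import random

def lay_out_item(stem, key, distractors, rng=random):
    """Return the labelled options in random order and the key's letter."""
    options = [key] + list(distractors)
    rng.shuffle(options)
    letters = "ABCD"
    labelled = [f"{letters[i]}) {opt}" for i, opt in enumerate(options)]
    return labelled, letters[options.index(key)]

opts, answer = lay_out_item(
    "Which validity type concerns traits such as creativity?",
    "Construct validity",
    ["Content validity", "Forecast validity", "Concurrent validity"],
)
print(answer)  # the key's letter varies from run to run
```

Shuffling per item, rather than fixing the key in one position, removes a positional clue that test-wise students could otherwise exploit.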
The score on objective items can be influenced by the error of guessing. One way of dealing with the problem of guessing is to adjust scores with a correction-for-guessing formula:

Corrected score = R - W / (n - 1)

where R is the number of items answered correctly, W is the number of items answered incorrectly, and n is the number of answer choices for an item.
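A minimal sketch of this adjustment, assuming the standard correction formula R - W/(n - 1); the scores used in the example are invented:

```python
# Correction for guessing: each wrong answer deducts 1/(n - 1) of a
# mark, since on an n-option item a blind guess is right 1/n of the
# time. Omitted items count as neither right nor wrong.

def corrected_score(right, wrong, n_choices):
    """Adjust a raw score downward to compensate for blind guessing."""
    return right - wrong / (n_choices - 1)

# A candidate answers 40 items correctly and 12 incorrectly on a
# four-option multiple-choice test (illustrative values).
print(corrected_score(40, 12, 4))  # 40 - 12/3 = 36.0
```

Note that a candidate who omits an item loses nothing, so this formula removes the incentive to guess blindly while leaving informed answers untouched.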