
Reliability & Validity

siti nur anis binti md azmi 2012306847

RELIABILITY
The consistency or dependability of the scores obtained.
A RELIABLE TEST produces similar scores across various conditions and situations, including different evaluators and testing environments.
Example 1: A typing test
Example 2: A weight scale


cont.

The scores obtained from one instrument can be reliable but not valid. Example: A researcher gave two forms (Forms I and II) of a test to a group of second-grade students to measure their knowledge of the history of Malaysia. Consistent scores across the two forms show reliability, but they would not be valid for an unrelated purpose; after all, what relationship is there between history and PE?


DISTINCTION BETWEEN RELIABILITY AND VALIDITY
Reliability and validity always depend on the context in which an instrument is used. Depending on the context, the instrument may or may not yield reliable (consistent) scores.


cont.

The bull's-eye represents the information desired.
Each dot represents a score obtained from an instrument.
A dot in the bull's-eye indicates that the score is the information the researcher desires.


ERRORS OF MEASUREMENT

Whenever a person takes the SAME test TWICE, they seldom perform exactly the same. Why??

Motivation

Energy

Anxiety

Different testing situation

Other factors??


cont.

Because errors of measurement are always present to some degree, researchers expect some variation in test scores whenever the instrument is administered more than once to the same group. Reliability estimates tell researchers how much variation to expect.

This estimate is known as the reliability coefficient.


RELIABILITY COEFFICIENT

Def: the degree of relationship between scores obtained by the same individuals on the same instrument at two different times.

Three ways of estimating it:
Test-retest method
Equivalent-forms method
Internal-consistency method


TEST-RETEST METHOD

Def: Administering the same test twice to the same group, with some time interval between the two administrations.

The reliability coefficient is affected by the length of time that elapses between the two tests. The longer the time interval, the lower the reliability coefficient tends to be, since changes in the individuals may occur.
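A minimal sketch of how a test-retest coefficient is typically computed: correlate each person's score on the first administration with his or her score on the second. The variable names and scores below are invented for illustration.

import numpy as np

# Hypothetical scores for eight students who took the same test twice
first_administration = np.array([12, 15, 9, 20, 18, 14, 11, 16])
second_administration = np.array([13, 14, 10, 19, 17, 15, 10, 18])

# The test-retest reliability coefficient is the correlation between the two sets of scores
r = np.corrcoef(first_administration, second_administration)[0, 1]
print(f"Test-retest reliability coefficient: {r:.2f}")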


cont.

However, not all variables are equally stable, e.g.:


Personal Characteristics

Abilities

Mood


EQUIVALENT-FORMS METHOD (ALTERNATE / PARALLEL)


Obtained by administering two equivalent (alternate/parallel) forms of a test to the same group at the same time.

The two forms cover the same content, although the specific items differ; the reliability coefficient is calculated between the two sets of scores obtained.

The time between giving the two forms should be as short as possible.


cont.
Considered the most acceptable estimate of reliability, and therefore the most commonly used in research. The two forms should be similar in content, difficulty level, arrangement, type of assessment, etc.


INTERNAL-CONSISTENCY METHOD


Split-Half Procedure

Kuder-Richardson Approaches

Alpha Coefficient


i) SPLIT-HALF PROCEDURE
The test is treated as two equivalent halves: give the test once, then score the two halves separately (odd items vs. even items).

The items on the instrument are divided into comparable halves, e.g., a scale divided so that one half yields the same score as the other. Looks at internal consistency.

Weakness: it is difficult to ensure that the two halves really are equivalent.


cont.
The coefficient indicates the degree to which the two halves of the test provide the same results, and hence describes the internal consistency of the test.

The reliability is calculated using the Spearman-Brown prophecy formula.


Reliability of scores on total test = (2 × reliability for ½ test) / (1 + reliability for ½ test)

Thus, if we obtained a correlation coefficient of .56 by comparing one half of the test items with the other half, the reliability of scores on the total test would be:

Reliability of scores on total test = (2 × .56) / (1 + .56) = 1.12 / 1.56 = .72
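A minimal sketch of the split-half procedure with the Spearman-Brown correction, using a small set of invented right/wrong item scores:

import numpy as np

# Hypothetical data: rows = examinees, columns = items scored 1 (right) or 0 (wrong)
items = np.array([
    [1, 1, 0, 1, 1, 0, 1, 1],
    [1, 0, 0, 1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0, 0, 0, 1],
    [1, 1, 0, 1, 1, 1, 1, 0],
])

odd_half = items[:, 0::2].sum(axis=1)    # each person's score on the odd-numbered items
even_half = items[:, 1::2].sum(axis=1)   # each person's score on the even-numbered items

r_half = np.corrcoef(odd_half, even_half)[0, 1]   # correlation between the two halves
r_total = (2 * r_half) / (1 + r_half)             # Spearman-Brown prophecy formula
print(f"Half-test r = {r_half:.2f}, full-test reliability = {r_total:.2f}")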


ii) KUDER-RICHARDSON APPROACHES

Estimates the homogeneity of a test whose items have a dichotomous response, e.g., yes/no or right/wrong items. Should be computed when a test is first checked for reliability and again for the actual sample. Based on the consistency of responses to all of the items of a single form of a test.


cont.

The most frequently employed method for determining internal consistency, particularly KR20 and KR21. KR21 requires only three pieces of information: the number of items, the mean, and the standard deviation. However, KR21 can be used only if it can be assumed that the items are all of equal difficulty.

KR21 reliability coefficient = (K / (K − 1)) × [1 − M(K − M) / (K × SD²)]

where K = number of items on the test, M = mean of the test scores, SD = standard deviation of the test scores.
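A minimal sketch of the KR21 formula above written as a function; the 50-item test with mean 40 and standard deviation 6 is an invented example, and the function name is mine.

def kr21(k, mean, sd):
    # KR21 = (K / (K - 1)) * (1 - M(K - M) / (K * SD^2))
    # Assumes all items are of equal difficulty
    return (k / (k - 1)) * (1 - mean * (k - mean) / (k * sd ** 2))

# Hypothetical test: 50 items, mean score 40, standard deviation 6
print(f"KR21 reliability coefficient: {kr21(50, 40, 6):.2f}")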


iii) ALPHA COEFFICIENT

Frequently called Cronbach's alpha. A general form of the KR20 formula, used to calculate the reliability of items that are not scored right versus wrong (e.g., essay items for which more than one answer is possible).
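A minimal sketch of the alpha coefficient for items that are not scored right versus wrong, e.g., ratings on a 1-5 scale; the data are invented.

import numpy as np

# Hypothetical data: rows = respondents, columns = item scores on a 1-5 scale
scores = np.array([
    [4, 5, 3, 4],
    [2, 3, 2, 3],
    [5, 5, 4, 5],
    [3, 4, 3, 3],
    [1, 2, 2, 1],
])

k = scores.shape[1]                               # number of items
item_variances = scores.var(axis=0, ddof=1)       # variance of each item
total_variance = scores.sum(axis=1).var(ddof=1)   # variance of the total scores

# Cronbach's alpha = (k / (k - 1)) * (1 - sum of item variances / total-score variance)
alpha = (k / (k - 1)) * (1 - item_variances.sum() / total_variance)
print(f"Cronbach's alpha: {alpha:.2f}")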


Method                      | Content   | Time interval | Procedure
Test-retest                 | Identical | Varies        | Give the identical instrument twice
Equivalent forms            | Different | None          | Give two forms of the instrument
Equivalent forms / retest   | Different | Varies        | Give two forms of the instrument, with a time interval between them
Internal consistency        | Different | None          | Divide the instrument into halves and score each, or use Kuder-Richardson approaches
Scoring / observer agreement| Identical | None          | Compare scores obtained by two or more observers or scorers


STANDARD ERROR OF MEASUREMENT (SEMeas)
Gives the margin of error to expect in an individual test score because of the imperfect reliability of the test.
For many IQ tests, the standard error of measurement over a one-year period and with different specific content is about 5 points. Over a 10-year period, it is about 8 points.
This means that a score fluctuates considerably more the longer the time between measurements. Thus, a person scoring 110 can expect to have a score between 100 and 120 one year later, but between 94 and 126 ten years later.


The formula for the standard error of measurement is

SEMeas = SD × √(1 − r11)

where SD = the standard deviation of the scores and r11 = the reliability coefficient appropriate to the conditions that vary.

In the first example above, the standard error (SEMeas) of 5 was obtained as follows: with SD = 16 and r11 = .90,

SEMeas = 16 × √(1 − .90) = 16 × √.10 = 16 × (.32) = 5.1
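A minimal sketch of the calculation above, plus the resulting score band; using plus or minus 2 SEMeas to form the band is an assumption made here to reproduce the 100-120 range in the IQ example.

import math

def standard_error_of_measurement(sd, r11):
    # SEMeas = SD * sqrt(1 - r11)
    return sd * math.sqrt(1 - r11)

sem_one_year = standard_error_of_measurement(sd=16, r11=0.90)
score = 110
low, high = score - 2 * sem_one_year, score + 2 * sem_one_year
print(f"SEMeas = {sem_one_year:.1f}; expected band around {score}: {low:.0f} to {high:.0f}")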


SCORING AGREEMENT

Most tests and many other instruments are administered with specific directions and are scored objectively. Although differences in the resulting scores with different administrators or scorers are still possible, it is generally considered highly unlikely that they would occur, except with essay questions.


Instruments that use direct observation are highly vulnerable to observer differences, so researchers who use such instruments are obliged to investigate and report the degree of scoring agreement.

Such agreement is enhanced by training the observers and by increasing the number of observation periods.
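The slides do not prescribe a particular agreement statistic, so the sketch below (with invented ratings) simply shows two easy checks: the percentage of exact agreements and the correlation between two observers' scores.

import numpy as np

# Hypothetical ratings of the same eight observation periods by two observers
observer_a = np.array([3, 4, 2, 5, 4, 3, 1, 4])
observer_b = np.array([3, 4, 2, 4, 4, 3, 2, 4])

percent_agreement = np.mean(observer_a == observer_b) * 100   # exact matches
correlation = np.corrcoef(observer_a, observer_b)[0, 1]       # linear agreement

print(f"Percent agreement: {percent_agreement:.0f}%")
print(f"Correlation between observers: {correlation:.2f}")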

