
Validity and Reliability

Chapter Eight
Dr Nek Kamal Yeop Yunus
Faculty of Business & Economics
Sultan Idris Education University

McGraw-Hill

© 2006 The McGraw-Hill Companies, Inc. All rights reserved.


Validity

Validity has been defined as referring to the appropriateness, correctness, meaningfulness, and usefulness of the specific inferences researchers make based on the data they collect.
It is the most important idea to consider when preparing or selecting an instrument.
Validation is the process of collecting and analyzing evidence to support such inferences.


Evidence of Validity

There are three types of evidence a researcher might collect:

Content-related evidence of validity: the content and format of the instrument.
Criterion-related evidence of validity: the relationship between scores obtained using the instrument and scores obtained using one or more other instruments or measures.
Construct-related evidence of validity: the nature of the psychological construct being measured by the instrument.


Illustration of Types of Evidence of Validity (Figure 8.1)


Content-related Evidence

A key element is the adequacy of the sampling of the domain the instrument is supposed to represent.
The other aspect of content validation is the format of the instrument.
Attempts to obtain evidence that the items measure what they are supposed to measure typify the process of gathering content-related evidence.


Criterion-related Evidence

A criterion is a second test presumed to measure the same variable.
There are two forms of criterion-related validity:

1) Predictive validity: a time interval elapses between administering the instrument and obtaining the criterion scores.
2) Concurrent validity: instrument data and criterion data are gathered and compared at the same time.

A correlation coefficient (r) indicates the degree of relationship that exists between the scores individuals obtain on two instruments.
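
For illustration, a criterion-related validity coefficient is just the Pearson correlation between the two sets of scores. A minimal sketch in Python with NumPy, using made-up scores:

    import numpy as np

    # Made-up scores for the same ten students on the instrument being
    # validated and on the criterion measure.
    instrument = np.array([12, 15, 9, 18, 14, 11, 16, 13, 17, 10])
    criterion = np.array([58, 71, 49, 83, 66, 55, 75, 62, 80, 52])

    # Pearson correlation coefficient (r) between the two sets of scores.
    r = np.corrcoef(instrument, criterion)[0, 1]
    print(f"validity coefficient r = {r:.2f}")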


Construct-related Evidence

Considered the broadest of the three categories.
There is no single piece of evidence that satisfies construct-related validity.
Researchers attempt to collect a variety of types of evidence, including both content-related and criterion-related evidence.
The more evidence researchers have from different sources, the more confident they become about the interpretation of the instrument.


Reliability

Refers to the consistency of the scores or answers provided by an instrument.
Scores obtained can be reliable but not valid.
An instrument should be both reliable and valid (Figure 8.2) in the context in which it is used.


Reliability and Validity (Figure 8.2)

Reliability of Measurement (Figure 8.3)

Errors of Measurement

Because errors of measurement are always present to some degree, variation in test scores is common.
This is due to:

Differences in motivation
Energy
Anxiety
Different testing situations


Reliability Coefficient

Expresses a relationship between scores from the same instrument at two different times, or between scores from different parts of the instrument.
The three best-known methods are:

Test-retest method
Equivalent-forms method
Internal-consistency method


Test-Retest Method

Involves administering the same test twice to the same group after a certain time interval has elapsed.
A reliability coefficient is calculated to indicate the relationship between the two sets of scores.
Reliability coefficients are affected by the lapse of time between the administrations of the test.
An appropriate time interval should be selected.
In educational research, stability of scores over a two-month period is considered sufficient evidence of test-retest reliability.


Equivalent-Forms Method

Two different but equivalent (alternate or parallel) forms of an instrument are administered to the same group during the same time period.
A reliability coefficient is then calculated between the two sets of scores.
It is possible to combine the test-retest and equivalent-forms methods by giving two different forms of the test with a time interval between the two administrations.


Internal-Consistency Methods

There are several internal-consistency methods that require only one administration of an instrument.

Split-half procedure: involves scoring two halves of a test separately for each subject and calculating the correlation coefficient between the two sets of scores.
Kuder-Richardson approaches (KR20 and KR21): the KR21 formula requires only three pieces of information: the number of items on the test, the mean, and the standard deviation. Considered the most frequently used method for determining internal consistency.
Alpha coefficient: a general form of KR20 used to calculate the reliability of items that are not scored right vs. wrong.
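
A minimal sketch of these three computations in Python with NumPy; the 0/1 item matrix is made-up data, and the split-half version steps the half-test correlation up with the Spearman-Brown formula, as is conventional:

    import numpy as np

    def split_half_reliability(items):
        # Correlate odd-item and even-item half scores, then step up
        # with the Spearman-Brown formula: r_full = 2r / (1 + r).
        odd = items[:, 0::2].sum(axis=1)
        even = items[:, 1::2].sum(axis=1)
        r_half = np.corrcoef(odd, even)[0, 1]
        return 2 * r_half / (1 + r_half)

    def kr21(items):
        # KR21 needs only the number of items K, the mean M, and the
        # variance of total scores; items are scored 0/1.
        totals = items.sum(axis=1)
        k = items.shape[1]
        m = totals.mean()
        var = totals.var(ddof=1)
        return (k / (k - 1)) * (1 - m * (k - m) / (k * var))

    def cronbach_alpha(items):
        # Alpha generalizes KR20 to items not scored right vs. wrong:
        # alpha = K/(K-1) * (1 - sum of item variances / total variance).
        k = items.shape[1]
        item_vars = items.var(axis=0, ddof=1).sum()
        total_var = items.sum(axis=1).var(ddof=1)
        return (k / (k - 1)) * (1 - item_vars / total_var)

    # Made-up 0/1 responses: 6 subjects x 8 items.
    data = np.array([
        [1, 1, 1, 0, 1, 1, 0, 1],
        [1, 0, 1, 1, 1, 0, 1, 1],
        [0, 1, 0, 0, 1, 0, 0, 0],
        [1, 1, 1, 1, 1, 1, 1, 0],
        [0, 0, 1, 0, 0, 1, 0, 0],
        [1, 1, 0, 1, 1, 1, 1, 1],
    ])
    print(f"split-half: {split_half_reliability(data):.2f}")
    print(f"KR21:       {kr21(data):.2f}")
    print(f"alpha:      {cronbach_alpha(data):.2f}")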


Standard Error of Measurement

An index that shows the extent to which a measurement would vary under changed circumstances.
There are many possible standard errors for a given set of scores.
Also known as measurement error, it yields a range of scores showing the amount of error that can be expected. (Appendix D)
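
In classical test theory the standard error of measurement is obtained from the score standard deviation and the reliability coefficient as SEM = SD * sqrt(1 - r); a small sketch with illustrative numbers:

    def standard_error_of_measurement(sd, reliability):
        # Classical test theory: SEM = SD * sqrt(1 - r), where r is the
        # instrument's reliability coefficient.
        return sd * (1 - reliability) ** 0.5

    # Illustrative values: score SD of 10 points, reliability of .91.
    sem = standard_error_of_measurement(10.0, 0.91)
    print(f"SEM = {sem:.1f}")  # a score of 70 suggests roughly 70 +/- 3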


Scoring Agreement

Scoring agreement requires a demonstration that independent scorers can achieve satisfactory agreement in their scoring.
Instruments that use direct observations are highly vulnerable to observer differences.
A correlation of at least .90 among scorers is desired as an acceptable level of agreement.
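
As a sketch of how the .90 benchmark might be checked, the following compares every pair of scorers with a Pearson correlation (the ratings are made-up):

    import numpy as np
    from itertools import combinations

    # Made-up ratings from three independent scorers for eight subjects.
    ratings = {
        "scorer_A": np.array([4, 7, 5, 8, 6, 3, 7, 5]),
        "scorer_B": np.array([5, 7, 4, 8, 6, 3, 8, 5]),
        "scorer_C": np.array([4, 6, 5, 7, 6, 4, 7, 6]),
    }

    # Require every pair of scorers to reach the .90 benchmark.
    for (name_a, a), (name_b, b) in combinations(ratings.items(), 2):
        r = np.corrcoef(a, b)[0, 1]
        verdict = "OK" if r >= 0.90 else "below benchmark"
        print(f"{name_a} vs {name_b}: r = {r:.2f} ({verdict})")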


Any questions?


Thank You

