Measurement Error
Whatever measurement we might make with regard to
some psychological construct, we do so with some amount
of error
Any observed score for an individual is their true score with error
added in
There are different types of error, but here we are concerned with a measure's inability to capture the true response for an individual
Observed Score = True Score + Error of Measurement (a small simulation of this model is sketched below)
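A minimal sketch of the true-score model above, assuming simulated data with arbitrarily chosen variances (none of the numbers come from the slides): observed scores are generated as true scores plus independent error, and reliability appears as the share of observed-score variance due to true scores.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 10_000                    # simulated respondents
true = rng.normal(50, 10, n)  # true scores (mean 50, SD 10) -- assumed values
error = rng.normal(0, 5, n)   # measurement error, independent of the true score
observed = true + error       # Observed Score = True Score + Error of Measurement

# Reliability: proportion of observed-score variance attributable to true scores
reliability = true.var() / observed.var()
print(round(reliability, 2))  # roughly 10**2 / (10**2 + 5**2) = 0.80
```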
Reliability
Reliability refers to a measure's ability to capture an individual's true
score, i.e., to accurately distinguish one person from another
While a reliable measure will be consistent, consistency can actually be
seen as a by-product of reliability, and in a case where we had perfect
consistency (everyone scores the same and gets the same score
repeatedly), reliability coefficients could not be calculated
No variance/covariance to give a correlation
The error in our analyses is due not only to individual differences but also
to the measure not being perfectly reliable
Reliability
Criteria of reliability
Test-retest
Test components (internal consistency)
Test-retest reliability
Consistency of measurement for individuals over time
Scores should be similar, e.g., today and six months from now (a sketch of computing this follows the issues listed below)
Issues
Memory
If the administrations are too close in time, the correlation between scores reflects memory of item responses
rather than the true score being captured
Chance covariation
In any sample, two variables will virtually always show some non-zero correlation
Reliability is not constant across subsets of a population
General-population IQ scores: good reliability
IQ scores among college students only: less reliable
Restriction of range, fewer individual differences
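A minimal sketch of a test-retest check on simulated data (the variable names and numbers are illustrative assumptions, not from the slides): the reliability estimate is the correlation between the two administrations, and restricting the sample's range lowers it.

```python
import numpy as np

rng = np.random.default_rng(1)

n = 5_000
true = rng.normal(100, 15, n)        # true IQ-like scores
time1 = true + rng.normal(0, 6, n)   # observed score at time 1
time2 = true + rng.normal(0, 6, n)   # observed score at time 2 (independent error)

# Test-retest reliability: correlation of the two administrations
r_full = np.corrcoef(time1, time2)[0, 1]

# Restriction of range: keep only high scorers (e.g., "college students")
keep = time1 > 110
r_restricted = np.corrcoef(time1[keep], time2[keep])[0, 1]

print(round(r_full, 2), round(r_restricted, 2))  # the restricted correlation is noticeably lower
```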
Internal Consistency
We can get a sort of average correlation among items
to assess the reliability of some measure
As one would likely assume intuitively, having more measures of something is better than having few
Having more items that correlate with one another will increase the test's reliability (see the sketch below)
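One common internal-consistency coefficient built on this average inter-item correlation idea is Cronbach's alpha (the coefficient is not named on the slide, and the data below are simulated for illustration):

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for a respondents x items score matrix."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()  # sum of the item variances
    total_var = items.sum(axis=1).var(ddof=1)    # variance of the total score
    return (k / (k - 1)) * (1 - item_vars / total_var)

# Simulated example: 5 items that all tap the same underlying trait
rng = np.random.default_rng(2)
n, k = 1_000, 5
trait = rng.normal(0, 1, (n, 1))
items = trait + rng.normal(0, 1, (n, k))         # each item = trait + unique noise

print(round(cronbach_alpha(items), 2))           # rises as more correlated items are added
```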
What's good reliability?
While we have conventions, it really kind of depends
As mentioned, the reliability of a measure may be different for
different groups of people
We may need to compare our reliability to that of measures already in place and deemed good, and obtain interval estimates to convey the uncertainty in our reliability estimate
Note also that reliability estimates are biased upward and so are a bit optimistic
Also, many of our techniques do not take into account the reliability of our measures, and poor reliability can result in lower statistical power, i.e., an increase in Type II error (the attenuation sketch below shows how unreliability shrinks the effects we can observe)
Though technically, increasing reliability can potentially also lower power
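One standard way to see how unreliability weakens what we can detect is the classical attenuation formula, r_observed = r_true * sqrt(rel_x * rel_y); the numbers below are assumed purely for illustration.

```python
import math

rho_true = 0.50   # true correlation between the two constructs (assumed)
rel_x = 0.70      # reliability of measure X (assumed)
rel_y = 0.60      # reliability of measure Y (assumed)

# Classical attenuation: the observed correlation shrinks with unreliability
r_observed = rho_true * math.sqrt(rel_x * rel_y)
print(round(r_observed, 2))  # ~0.32 -- a smaller effect to detect, hence lower power
```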
Replication and Reliability
While reliability implies replicability, assessing reliability does not provide a
probability of replication
Note also that statistical significance is not a measure of reliability or replicability
Replication is perhaps not conducted as much as it should be in psychology, for a
number of reasons
Practical concerns, lack of publishing outlets etc.
Furthermore, knowing that our estimates are themselves biased and variable, in
many cases we might not even expect consistent research
findings
In psychology, many people spend a lot of time debating back and forth about
the merits of some theory, citing cases where it did or did not replicate
However the lack of replication could be due to low power, low reliability,
problem data, incorrectly carrying out the experiment etc.
In other words, the finding didn't replicate because of the methodology, not because the theory was
wrong

Factors affecting the utility of replications
You can't step in the same river twice!
Heraclitus
When
Later replications do not provide as much new information; however,
they can contribute greatly to the overall assessment of an effect
Meta-analysis
How
There is no perfect replication (different people involved, time it
takes to conduct etc.)
Doing exact replication gives us more confidence in the original
finding (should it hold), but may not offer much in the way of
generalization
Example: doing a gender difference study at UNT over and over. Does it
work for non-college folk? People outside of Texas?
Factors affecting the utility of replications
By whom
It is well known that those with a vested interest in some idea tend
to find confirming evidence more than those who don't
Replications by others are still being done by those with an interest
in that research topic and so may have a precorrelation inherent in
their attempt
Direct: correlation of attributes of persons involved
Indirect: correlation of data to be obtained
The gist: we can't have truly independent replication attempts,
but we must strive to minimize bias
The more independent replication attempts are, the more
informative they will be
Validity
Validity refers to the question of whether our
measurements are actually hitting on the construct we
think they are
While we can obtain specific statistics for reliability (even
different types), validity is more of a global assessment
based on the evidence available
We can have reliable measurements that are invalid
Classic example: a scale that is consistent and can distinguish one person
from the next, but is actually off by 5 pounds (sketched below)
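A minimal sketch of the biased-scale example, with the 5-pound offset and error size assumed for illustration: the scale correlates almost perfectly with true weight (reliable), yet its readings are systematically wrong (invalid).

```python
import numpy as np

rng = np.random.default_rng(3)

n = 2_000
true_weight = rng.normal(170, 20, n)             # true weights in pounds
scale = true_weight - 5 + rng.normal(0, 1, n)    # consistent scale, but reads 5 lb low

# Reliable: it distinguishes one person from the next almost perfectly
print(round(np.corrcoef(true_weight, scale)[0, 1], 3))

# Invalid: systematically biased relative to the true values
print(round(np.mean(scale - true_weight), 1))    # about -5.0
```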
Validity Criteria in Psychological Testing
Content validity
Criterion validity
Concurrent
Predictive
Construct-related validity
Convergent
Discriminant

Content validity
Items represent the kinds of material (or content areas) they are supposed to
represent
Are the questions worth a flip, i.e., do they cover all domains of a given
construct?
E.g. job satisfaction = salary, relationship w/ boss, relationship w/ coworkers etc.
Validity Criteria in Psychological Testing
Criterion validity
The degree to which the measure correlates with various outcomes (criteria)
E.g., does some new personality measure correlate with the Big 5?
Concurrent
Criterion is in the present
Measure of ADHD and current scholastic behavioral problems
Predictive
Criterion in the future
SAT and college GPA
Validity Criteria in Psychological Testing
Construct-related validity
How much is it an actual measure of the construct of interest
Convergent
Correlates well with other measures of the same construct
A depression scale correlates well with other depression scales
Discriminant
Is distinguished from related but distinct constructs
Depression scale != stress scale (a minimal correlation sketch follows)
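A minimal sketch of checking convergent and discriminant validity with correlations (the scale names and data are simulated assumptions): the two depression scales should correlate strongly with each other and more weakly with the stress scale.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 1_000

depression = rng.normal(0, 1, n)
stress = 0.4 * depression + rng.normal(0, 1, n)   # related but distinct construct

dep_scale_a = depression + rng.normal(0, 0.5, n)  # two measures of depression
dep_scale_b = depression + rng.normal(0, 0.5, n)
stress_scale = stress + rng.normal(0, 0.5, n)

# Convergent: high correlation between the two depression scales
print(round(np.corrcoef(dep_scale_a, dep_scale_b)[0, 1], 2))

# Discriminant: noticeably lower correlation with the stress scale
print(round(np.corrcoef(dep_scale_a, stress_scale)[0, 1], 2))
```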
Validity Criteria in Experimentation
Statistical conclusion validity
Is there a causal relationship between X and Y?
Correlation is our starting point (i.e., correlation isn't causation, but it is where we begin to establish it)
Related to this is the question of whether the study was sufficiently sensitive to pick
up on the correlation
Internal validity
Has the study been conducted so as to rule out other effects which were controllable?
Poor instruments, experimenter bias
External validity
Will the relationship be seen in other settings?
Construct validity
Same concerns as before
Ex. Is reaction time an appropriate measure of learning?
Summary
Reliability and Validity are key concerns in psychological
research
Part of the problem in psychology is the lack of reliable
measures of the things we are interested in
Assuming that they are valid to begin with, we must always
press for more reliable measures if we are to progress
scientifically
This means letting go of supposed standards when they are no
longer as useful, and looking for ways to improve current ones