Scaling: Nominal Scales, Ordinal Scales, Interval Scales, Ratio Scales
Measurement and Scaling
1. Nominal scales
2. Ordinal scales
3. Interval scales
4. Ratio scales
SCALES
Males = 1, Females = 2
Sales Zone A = Islamabad, Sales Zone B = Rawalpindi
Drink A = Pepsi Cola, Drink B = 7-Up, Drink C = Miranda
Measurement and Scaling
An interval scale allows us to perform certain arithmetical
operations on the data collected from respondents. This scale
measures the distance between any two points on the scale.
It taps the differences and the magnitudes of the differences in
the variable.
Measurement and Scaling
Type of Scale   Numerical Operation                                  Descriptive Statistics
Nominal         Counting                                             Frequency in each category, percentage in each category, mode
Ordinal         Rank ordering                                        Median, range, percentile ranking
Interval        Arithmetic operations on intervals between numbers   Mean, standard deviation, variance
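The table above can be read as a rule about which summaries make sense for each type of scale. A minimal sketch in Python, using only the standard library; the function names and the sample responses are hypothetical and chosen for illustration only:

```python
from collections import Counter
from statistics import mean, median, mode, pstdev

def nominal_summary(values):
    """Counting only: frequency and percentage per category, plus the mode."""
    counts = Counter(values)
    percentages = {k: 100 * n / len(values) for k, n in counts.items()}
    return counts, percentages, mode(values)

def ordinal_summary(ranks):
    """Rank ordering: the median and the range are meaningful."""
    return median(ranks), max(ranks) - min(ranks)

def interval_summary(scores):
    """Arithmetic on intervals: mean, standard deviation and variance."""
    sd = pstdev(scores)  # population standard deviation
    return mean(scores), sd, sd ** 2

# Hypothetical responses, for illustration only.
drinks = ["Pepsi Cola", "7-Up", "Pepsi Cola", "Miranda", "Pepsi Cola"]
preference_ranks = [1, 2, 3, 4, 5]
ratings = [2, 4, 4, 4, 5, 5, 7, 9]

counts, percentages, most_common = nominal_summary(drinks)
middle_rank, rank_range = ordinal_summary(preference_ranks)
avg, sd, var = interval_summary(ratings)
```

Taking a mean of the drink labels would not be meaningful, which is why the nominal helper only counts.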
Four Scales of Measurement;
Scale      Measure                                  Values for three runners at the finish
Nominal    Numbers assigned to runners              7             8              3
Ordinal    Rank order of winners                    Third place   Second place   First place
Interval   Performance rating on a 0 to 10 scale    8.2           9.1            9.6
Ratio      Time to finish, in seconds               15.2          14.1           13.4
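The runners example can be worked through in a few lines of Python. This is a sketch that assumes the three columns of the figure line up by runner (runner 7 with 8.2 and 15.2, and so on); the variable names are illustrative.

```python
# Values from the runners example; the column alignment is an assumption.
runners = [
    {"number": 7, "rating": 8.2, "seconds": 15.2},
    {"number": 8, "rating": 9.1, "seconds": 14.1},
    {"number": 3, "rating": 9.6, "seconds": 13.4},
]

# Nominal: the runner number is only a label, so it is never averaged.
labels = [r["number"] for r in runners]

# Ordinal: sorting by finishing time gives the rank order, fastest first.
finish_order = [r["number"] for r in sorted(runners, key=lambda r: r["seconds"])]

# Interval: differences between ratings are meaningful.
rating_gap = runners[2]["rating"] - runners[0]["rating"]

# Ratio: a true zero point makes ratios of times meaningful.
time_ratio = runners[0]["seconds"] / runners[2]["seconds"]
```

Here `finish_order` puts runner 3 first and runner 7 third, and `time_ratio` says runner 7 took roughly 1.13 times as long as runner 3, a statement that only a ratio scale supports.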
Classification of Scaling Techniques;
Scales: Nominal, Ordinal, Interval, Ratio
Techniques shown: Likert, Semantic differential, Numerical, Itemized rating, Stapel, Fixed sum, Graphic rating
Four Scales of Measurement;
Example.
Please indicate your current marital status.
__ Married __ Single __ Single, never married __ Widowed
Four Scales of Measurement;
Example.
Which one statement best describes your opinion of an Intel PC
processor?
__ Higher than AMD’s PC processor
__ About the same as AMD’s PC processor
__ Lower than AMD’s PC processor
Four Scales of Measurement;
Example.
How likely are you to recommend the new phone to a friend?
Definitely will not   1   2   3   4   5   6   7   Definitely will
Four Scales of Measurement;
Example 1.
Please circle the number of children under 18 years of age
currently living in your household.
0 1 2 3 4 5 6 7 (if more than 7, please specify ___.)
Chapter 7
MEASUREMENT:
SCALING, RELIABILITY,
VALIDITY
Methods of Scaling;
Rating scales
Ranking scales
Dichotomous scale
Is used to obtain a Yes or No answer.
Nominal scale
Yes
No
Rating Scales Formats;
Category scale
Uses multiple items to elicit a single response.
Nominal scale
Rating Scales Formats;
Example:
Please rate brand A on each of the following dimensions:
                      poor                   excellent
a) Durability         [ ]  [ ]  [ ]  [ ]  [ ]
b) Fuel consumption   [ ]  [ ]  [ ]  [ ]  [ ]
Rating Scales Formats;
Likert scale
Is designed to examine how strongly subjects
agree or disagree with statements on a
5-point scale.
Interval scale
Rating Scales Formats;
Likert scale
My work is very interesting
Strongly disagree
Disagree
Neither agree nor disagree
Agree
Strongly agree
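A Likert item like the one above is usually turned into a number before analysis. A minimal sketch, assuming the five labels are coded 1 to 5 and that several items are added into one summated score; the helper names and the reverse-scoring option for negatively worded items are assumptions added for illustration, not part of the slide.

```python
# Coding of the five response labels (1 to 5 is an assumed convention).
LIKERT_POINTS = {
    "Strongly disagree": 1,
    "Disagree": 2,
    "Neither agree nor disagree": 3,
    "Agree": 4,
    "Strongly agree": 5,
}

def score_item(response, reverse=False):
    """Convert one response label to points; flip it for a negatively worded item."""
    points = LIKERT_POINTS[response]
    return 6 - points if reverse else points

def summated_score(responses, reverse_items=()):
    """Add the item scores of one respondent; reverse_items holds item positions."""
    return sum(
        score_item(r, reverse=(i in reverse_items))
        for i, r in enumerate(responses)
    )

# Hypothetical answers of one respondent to three statements.
answers = ["Agree", "Strongly agree", "Disagree"]
plain_total = summated_score(answers)
adjusted_total = summated_score(answers, reverse_items={2})
```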
Rating Scales Formats;
Example:
Please rate car model A on each of the following dimensions:
Durable ---:-X-:---:---:---:---:--- Not durable
Low fuel consumption ---:---:---:---:---:-X-:--- High fuel consumption
Rating Scales Formats;
Numerical scale
Similar to the semantic differential scale, with the difference
that numbers on a 5-point or 7-point scale are provided, with
bipolar adjectives at both ends.
Interval scale
             Poor               Excellent
Durability   1  2  3  4  5  6  7
Rating Scales Formats;
1 = Very Unlikely, 2 = Unlikely, 3 = Neither Unlikely Nor Likely, 4 = Likely, 5 = Very Likely
Rating Scales Formats;
Stapel scale
This scale simultaneously measures both the direction and
intensity of the attitude toward the items under study.
A simplified version of the semantic differential scale in which
a single adjective or descriptive phrase is used instead of
bipolar adjectives.
Interval data
Model A
-3 -2 -1 Durable Car 1 2 3
-3 -2 -1 Good Fuel Consumption 1 2 3
Rating Scales Formats;
The Stapel scale is a unipolar rating scale with ten categories
numbered from -5 to +5, without a neutral point (zero). This scale
is usually presented vertically.
SEARS
+5 +5
+4 +4
+3 +3
+2 +2X
+1 +1
HIGH QUALITY POOR SERVICE
-1 -1
-2 -2
-3 -3
-4X -4
-5 -5
Ordinal scale
Rating Scales Formats;
Paired Comparison
Used when, among a small number of objects, respondents are
asked to choose between two objects at a time.
Example; Choose any combination
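One reason paired comparison suits only a small number of objects is that n objects produce n(n-1)/2 pairs. A short sketch that lists the pairs a respondent would be shown; the function name is hypothetical and the brand names are borrowed from the nominal-scale example earlier in the deck.

```python
from itertools import combinations

def paired_comparisons(objects):
    """Return every unordered pair of objects, one pair per choice task."""
    return list(combinations(objects, 2))

brands = ["Pepsi Cola", "7-Up", "Miranda"]
pairs = paired_comparisons(brands)

# The number of pairs grows quickly: n(n-1)/2 for n objects.
pairs_for_ten = len(paired_comparisons(range(10)))
```

Three brands need only 3 choices, while ten objects already need 45.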
Forced Choice
Enables respondents to rank objects relative to one another,
among the alternatives provided.
Ranking Scales Formats;
Comparative Scale
Provides a benchmark or a point of reference to assess
attitudes toward the current object, event, or situation under
study.
Characteristics of Different Types of Rating Scales
2. Category scale
   Subject must: Indicate a response category
   Advantages: Flexible, easy to respond
   Disadvantages: Ambiguous items, few categories, only gross distinction
3. Likert scale
   Subject must: Evaluate statements on a 5-point scale
   Advantages: Easiest scale to construct
   Disadvantages: Hard to judge what a single score means
4. Semantic differential and numerical scales
   Subject must: Choose points between bipolar adjectives on relative dimensions
   Advantages: Easy to construct, norms exist for comparison, e.g. profile analysis
   Disadvantages: Bipolar adjectives must be found, data may be ordinal, not interval
5. Constant sum scale
   Subject must: Divide a constant sum among response alternatives
   Advantages: Scale approximates an interval measure
   Disadvantages: Difficult for respondents with low education levels
6. Stapel scale
   Subject must: Choose point on scale with 1 center adjective
   Advantages: Easier to construct than semantic differential
   Disadvantages: Endpoints are numerical, not verbal
7. Graphic scale
   Subject must: Choose a point on a continuum
   Advantages: Visual impact, unlimited scale points
   Disadvantages: No standard answers
8. Graphic scale-picture response
   Subject must: Choose a visual picture
   Advantages: Visual impact, easy for poor readers
   Disadvantages: Hard to attach a verbal explanation to response
Goodness of Measures;
Understanding Validity and Reliability
[Figure 8.1: Illustrations of Possible Reliability and Validity Situations in Measurement]
Goodness of Measures
It is important to make sure that the instrument that we develop to
measure a particular concept is indeed accurately measuring the
variable, and that in fact, we are actually measuring the concept
that we set out to measure.
Item Analysis
Item analysis is done to see if the items in the instrument belong
there or not.
Each item is examined for its ability to discriminate between those
subjects whose total scores are high and those with low scores.
In item analysis, the means of the high-score group and the
low-score group are tested for significant differences
through the t-values.
Items with a high t-value, which marks them as highly
discriminating, are then included in the instrument.
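The item analysis steps above can be sketched as follows. The slide does not say how the high and low groups are formed or which form of the t-test is used, so the cut-off `fraction`, the choice of Welch's t, the function names and the sample data are all assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """t-value for the difference between two group means (Welch's form).
    Fails with a division by zero if neither group shows any variation."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se

def item_t_values(responses, fraction=0.25):
    """responses: one row of item scores per respondent.
    Compares the top and bottom `fraction` of respondents by total score."""
    ranked = sorted(responses, key=sum)
    n = max(2, int(len(ranked) * fraction))
    low, high = ranked[:n], ranked[-n:]
    return [
        welch_t([r[i] for r in high], [r[i] for r in low])
        for i in range(len(responses[0]))
    ]

# Hypothetical data: 8 respondents, 3 items on a 5-point scale.
data = [
    [5, 5, 3], [4, 5, 2], [5, 4, 4], [4, 4, 3],
    [2, 1, 3], [1, 2, 4], [2, 2, 2], [1, 1, 3],
]
t_values = item_t_values(data, fraction=0.5)
```

In this made-up data the first two items separate the high and low scorers clearly (t of about 7.3), while the third item has a t-value of 0 and would be a candidate for removal. Each t-value would still need to be compared against a critical value before calling the difference significant.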
Goodness of Measures;
Reliability
The reliability of a measure indicates the extent to which it
is without bias (error free) and hence ensures consistent
measurement across time and across the various items in
the instrument.
In other words, the reliability of a measure is an indication
of the stability and consistency with which the instrument
measures the concept and helps to assess the “goodness”
of a measure.
Goodness of Measures;
Stability of Measures
The ability of a measure to remain the same over time —despite
uncontrollable testing conditions or the state of the respondents
themselves—is indicative of its stability and low vulnerability to
changes in the situation.
This attests to its “goodness” because the concept is stably
measured, no matter when it is done. Two tests of stability are
test-retest reliability and parallel-form reliability.
Testing Goodness of Measures: Forms of Reliability and Validity.
Goodness of data
  Reliability (accuracy in measurement)
    Stability
      Test-retest reliability
      Parallel-form reliability
    Consistency
      Interitem consistency reliability
      Split-half reliability
  Validity (are we measuring the right thing?)
Test-Retest Reliability
The reliability coefficient obtained with a repetition of the same
measure on a second occasion is called test-retest reliability.
Parallel-Form Reliability
When responses on two comparable sets of measures tapping
the same construct are highly correlated, we have parallel-form
reliability.
Both forms have similar items and the same response format, the
only changes being the wordings and the order or sequence of
the questions.
What we try to establish here is the error variability resulting
from wording and ordering of the questions.
If two such comparable forms are highly correlated the measures
are reasonably reliable.
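Both stability tests come down to a correlation between two sets of scores from the same respondents: two occasions for test-retest, two comparable forms for parallel-form. A minimal sketch using the Pearson correlation coefficient; the function name and the scores are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical total scores of five respondents.
first_occasion = [10, 12, 14, 16, 18]
second_occasion = [11, 12, 15, 15, 19]

test_retest_r = pearson_r(first_occasion, second_occasion)
```

For parallel-form reliability the same function would be called with the scores from form A and form B instead of the two occasions.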
Goodness of Measures;
Split-Half Reliability
Split-half reliability reflects the correlations between two halves
of an instrument.
The estimates would vary depending on how the items in the
measure are split into two halves.
Split-half reliabilities could be higher than Cronbach’s alpha only
in the circumstance of there being more than one underlying
response dimension tapped by the measure and when certain
other conditions are met as well.
Hence, in almost all cases, Cronbach’s alpha can be considered
a perfectly adequate index of the interitem consistency reliability.
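A small numerical sketch may help to show why split-half estimates depend on the split while Cronbach's alpha gives a single figure. The slides give no formula, so the code uses the standard definition of alpha, k/(k-1) * (1 - sum of item variances / variance of total scores); the function names and the data are hypothetical.

```python
from math import sqrt
from statistics import variance

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def cronbach_alpha(rows):
    """rows: one list of item scores per respondent."""
    k = len(rows[0])
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def split_half_r(rows, half_a, half_b):
    """Correlate the summed scores of two halves, given as item positions."""
    a = [sum(r[i] for i in half_a) for r in rows]
    b = [sum(r[i] for i in half_b) for r in rows]
    return pearson_r(a, b)

# Hypothetical data: 5 respondents, 4 items.
rows = [[2, 3, 2, 3], [3, 3, 4, 4], [4, 5, 4, 5], [5, 5, 5, 5], [1, 2, 1, 2]]

alpha = cronbach_alpha(rows)
odd_even_r = split_half_r(rows, (0, 2), (1, 3))
first_last_r = split_half_r(rows, (0, 1), (2, 3))
```

With this data the two ways of halving the instrument give slightly different correlations (about 0.949 and 0.952), which illustrates the point that the estimate varies with the split.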
Goodness of Measures;
5. Validity
Several types of validity tests are used to test the goodness of measures and
writers use different terms to denote them. For the sake of clarity, we may
group validity tests under three broad headings: content validity,
criterion-related validity, and construct validity.
5.1 Content Validity
Content validity ensures that the measure includes an adequate and
representative set of items that tap the concept. The more the scale items
represent the domain or universe of the concept being measured, the greater
the content validity. To put it differently, content validity is a function of how
well the dimensions and elements of a concept have been delineated.
Criterion-Related Validity
Criterion-related validity is established when the measure differentiates
individuals on a criterion it is expected to predict. This can be done by
establishing concurrent validity or predictive validity.
Concurrent validity is established when the scale discriminates individuals
who are known to be different; that is, they should score differently on the
instrument.
Goodness of Measures;
Convergent validity is established when the scores obtained with two different
instruments measuring the same concept are highly correlated.
Validity                     Description
Content validity             Does the measure adequately measure the concept?
Face validity                Do “experts” validate that the instrument measures what its name suggests it measures?
Criterion-related validity   Does the measure differentiate in a manner that helps to predict a criterion variable?
Concurrent validity          Does the measure differentiate in a manner that helps to predict a criterion variable currently?
Predictive validity          Does the measure differentiate individuals in a manner that helps to predict a future criterion?
Construct validity           Does the instrument tap the concept as theorized?
Convergent validity          Do two different instruments measuring the same concept correlate highly?
Discriminant validity        Does the measure have low correlation with a variable that is supposed to be unrelated to this variable?
Goodness of Measures
Reliability
Indicates the extent to which a measure is without bias (error
free) and hence ensures consistent measurement
across time and across the various items in the
instrument.
Goodness of Measures-Reliability
Stability of measures:
Test-retest reliability
Parallel-form reliability
Correlation
Internal consistency of measures:
Interitem consistency reliability
Cronbach’s alpha
Split-half reliability
Correlation
Goodness of Measures-Validity
Validity
Ensures the ability of a scale to measure the intended concept.
Content validity
Criterion-related validity
Construct validity
Goodness of Measures-Validity
Content validity
Ensures that the measure includes an adequate and
representative set of items that tap the concept.
A panel of judges
Goodness of Measures-Validity
Construct validity
Testifies to how well the results obtained from the use of the
measure fit the theories around which the test is designed.
Convergent validity: established when the scores obtained
with two different instruments measuring the same concept are
highly correlated
Discriminant validity: established when, based on theory, two
variables are predicted to be uncorrelated, and the scores
obtained by measuring them are indeed empirically found to
be so
Correlation, factor analysis, convergent-discriminant
techniques, multitrait-multimethod analysis
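As a rough illustration of the correlation approach, convergent and discriminant validity can be checked side by side: a high correlation is expected between two instruments for the same concept, and a low one with a theoretically unrelated variable. The function name and all scores below are hypothetical, and what counts as "high" or "low" is left to the researcher.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical scores of five respondents.
instrument_a = [1, 2, 3, 4, 5]        # first instrument for the concept
instrument_b = [2, 3, 5, 6, 9]        # second instrument, same concept
unrelated_variable = [3, 1, 4, 2, 3]  # variable predicted to be uncorrelated

convergent_r = pearson_r(instrument_a, instrument_b)
discriminant_r = pearson_r(instrument_a, unrelated_variable)
```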