University of
Urbana-Champaign
http://www.archive.org/details/factoranalysisin18well
Faculty
Working Papers
College of
Commerce and
Business Administration
No.
18
When
it works well,
it also points
to interesting relationships that might not have been obvious from examination
First,
(but not necessarily most important) it can point out the "latent" -- that is,
"underlying," not directly observed -- dimensions that account for the relationships among product preferences or other kinds of ratings obtained in market
studies.
For example, suppose one has no idea of the dimensions consumers use
If respondents are asked to rate several
types of liquors (Scotch, gin, rum, bourbon, vodka, etc.) according to preference,
a factor analysis may reveal some salient characteristics of liquors that under-
of some help to
the
The second way factor analysis can be helpful is by pointing out relationships among observed values that were there all the time but not easy to see.
For example, a factor analysis of cosmetic use once suggested that hair spray is
with
than with other products women use specifically for their hair.
The authors wish to express their great appreciation for comments on earlier drafts of this chapter made by Harry Roberts and Douglas Tigert of the Graduate School of Business, University of Chicago. For the occupational data used in the first part of the chapter we are indebted to Douglas Tigert.
that hair spray, eye shadow and lipstick belong in a group of purchases and
activities that include the number of movies attended in the past month.
This
A lipstick instead of a
comb as a
latent dimension and the unforeseen relationship are perhaps of more value than
the specific immediate application, but on an exceptionally good day one can have
all three.
Third, factor analysis is useful when data must be condensed and simplified.
Suppose, for example, that some television commercials have been rated on 50 or
several groups of scales, are heavily correlated with one another because they
are in fact redundant, factor analysis will summarize the information in them and
It is
comes
in handiest when the marketing vice-president says, "Don't give me ten pages of
numbers.
Finally, and related to the third use, factor analysis can be employed as
one step in empirical clustering of products, media, stimuli or people.
In the
possibly
nonobvious types.
What Is Factor Analysis?
Factor analysis is
the
predictor (independent)
that is a function of some underlying, latent and hypothetical set of factors.
Conversely, one can look at each factor as a dependent variable that is a function
of the observed variables.
Several methods of factor analysis are available, and these several methods do not necessarily give the same results.
Vocabulary
Factor analysis has some specialized concepts and terminology.
A factor
For
TABLE 1
Correlations Among Variables
Variable
2 3
7
8 28
1.
Author, Fiction
76
48
47
20
19
08
07
13
25
05
10 09
18
2.
76 48
20
47 19 07
25
22 53
30
31
21
26
3.
Newspaper Reporter
22 22
13
4. Computer Programmer
5.
42 42
53
00
-01 08
20
09
31
33
18 33
Bookkeeper
08
25 05
28 18
36
36
6. 7.
25 10 30
21
22
09
31
00
-01
08
31
45
45
34
34
8.
9.
Doctor
Lab Technician
20
33
09
IB
48
48
26
33
large sample
(850) of homemakers.
each occupation in terms of how well she thought she would do in it if she had
the opportunity to build a career in that field.
2
(author,
(author, children's
The entry 20 in Row 1, column 4 means that homemakers who thought they
would do well as an author of fiction had a much weaker tendency to say they would do well as a computer programmer (occupation 4).
It is obvious from inspection alone that variables 1, 2 and 3 have something
in common that they do not share with the other variables.
It is also
This sort of pattern in raw data leads to the inference that one under-
data are not usually so striking as in this example contrived for clarity of
exposition.
The data are real, but they were especially selected so as to make
the relationship between the raw data and the factor solution intuitively obvious.
will
be given later.
used
methods
TABLE
(decimals omitted)
Variable
1.
Loadings on Factors
Author, fiction
^089
'
B 08
07
C -05
-10
-17
h² (communality) 82
2. 3.
80
53
69 57
62
69
14
15 81
4.
5.
6.
Computer programmer
Bookkeeper
Coll math teacher
Nurse
-08
-01
76
73
03
20
-02
28
13
-20
-83
-77
7 8.
9.
-13
14 36
70
69 61
.
Doctor
Lab technician
-68
Sum
of squares (eigenvalue)
223
197
183
603
The entries in
Columns A,
They show
how closely the 9 variables are related to each of the three underlying factors,
and they are the key to understanding what the factors mean.
heavy loadings of Author, Fiction; Author, Children's books; and Newspaper re-
As long as all
the signs in a column are changed, it is the absolute size of the loadings,
The h
means that not much of the variable is left over after whatever the factors represent is taken into consideration.
factors taken together do not account for much of whatever the variable is all
about.
or eigenvalue
each factor in accounting for the particular set of variables being analyzed.
is sometimes implied, mistakenly, that the eigenvalue indicates importance in
one specific solution method) and can easily be changed by changing either the
The entry at the far right of the eigenvalue row (603 in this case) is
highly redundant groups, and if the extracted factors account for all the groups,
the index will approach unity.
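The row and column summaries of a loadings table can be illustrated with a short sketch. The loadings below are hypothetical values chosen for easy arithmetic, not figures from the tables in this paper, and the variable names are assumptions made for the illustration. Each communality is the sum of squared loadings across a row, each entry in the sum of squares (eigenvalue) row is the sum of squared loadings down a column, and dividing the grand total by the number of variables gives the share of total variance the factors account for.

```python
import numpy as np

# Hypothetical loadings for four variables on two factors
# (illustrative values only, not taken from the paper's tables).
loadings = np.array([
    [0.9, 0.1],
    [0.8, 0.2],
    [0.1, 0.7],
    [0.2, 0.6],
])

squared = loadings ** 2

# Communality (h^2): sum of squared loadings across each row.
communality = squared.sum(axis=1)

# Sum of squares ("eigenvalue" row): sum of squared loadings down each column.
sums_of_squares = squared.sum(axis=0)

# Share of total variance accounted for: grand total over number of variables.
proportion_explained = sums_of_squares.sum() / loadings.shape[0]

print(communality)
print(sums_of_squares)
print(proportion_explained)
```

Because every variable in standard-score form contributes one unit of variance, the last figure can only approach 1.0 when the extracted factors account for nearly everything in the variables.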
procedures.
While
cedures provide a large number of options to the analyst to suit the method to
his purposes.
Correlation, Covariance or Cross-Products Matrix?
Although it is most
standard scores in which the averages of all variables are set equal to zero and
the variances equal to one, factor analysis of correlations loses two of the
three types of information contained in a data matrix, namely the levels and
dispersions of variables.
matrix (only means set equal to zero, but variance not standardized) or
cross-products matrix
of correlations.
if the units of measurement are identical or very similar across variables, and if individual differences are expected, it may be better to factor analyze a
while some received relatively low average ratings, or if there had been much
more disagreement about some occupations than others, our use of a correlation
matrix as a starting point would have erased this possibly useful information.
Moral:
the most widely used factor analytic procedures ignore level differences
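The difference among the three starting matrices can be sketched numerically. The data below are simulated, and the variable names are assumptions made for the illustration; the point is only to show which information each matrix retains.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ratings: 200 respondents by 3 variables that differ
# in average level (2, 5, 8) and in dispersion (0.5, 1.0, 2.0).
X = rng.normal(loc=[2.0, 5.0, 8.0], scale=[0.5, 1.0, 2.0], size=(200, 3))
n = X.shape[0]

# Cross-products matrix: raw scores, so levels and dispersions both remain.
cross_products = X.T @ X / n

# Covariance matrix: means set to zero, so levels are lost.
centered = X - X.mean(axis=0)
covariance = centered.T @ centered / n

# Correlation matrix: means zero and variances one, so both are lost.
z = centered / centered.std(axis=0)
correlation = z.T @ z / n

print(np.diag(cross_products))  # reflects level and spread
print(np.diag(covariance))      # reflects spread only
print(np.diag(correlation))     # all ones
```

Only the diagonal of the correlation matrix is identical across variables, which is the numerical form of the statement that level and dispersion differences have been erased.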
If these differences are
tho.
Correlation Matrix?
If the researcher
seems
to be
(unity).
now so much in vogue that it is the standard option employed in many of the widely used canned computer programs.
Rotation:
from the initial
most
the
microscope slide.
though
actually there.
Different rotations give entirely different appearing results.
From a
statistical point of view, all results are equal, none superior or inferior to
others; but from the point of view of making sense of the results of factor
3.
The "un-
rotated factor loadings matrix" -- shown here for the nine-variable occupation
factor loadings will show that the pattern is not at all clear.
the right,
the factors have been rotated by the varimax method, a procedure that
produces, within each factor, as many high loadings and as many low loadings as
possible.
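A varimax rotation can be sketched in a few lines. This is a minimal illustration using a commonly circulated iterative scheme based on the singular value decomposition, not the program used for the tables in this paper; the function names, the hypothetical loadings, and the 30-degree mixing angle are all assumptions made for the example.

```python
import numpy as np

def varimax_criterion(a):
    """Sum over factors of the variance of the squared loadings."""
    return (a ** 2).var(axis=0).sum()

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal rotation that seeks to maximize the (raw) varimax criterion."""
    p, k = loadings.shape
    rotation = np.eye(k)
    previous = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient-like matrix of the criterion with respect to the rotation.
        target = rotated ** 3 - rotated * (rotated ** 2).sum(axis=0) / p
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt          # nearest orthogonal matrix
        if s.sum() < previous * (1 + tol):
            break
        previous = s.sum()
    return loadings @ rotation, rotation

# Hypothetical "clean" structure, then deliberately mixed by a 30-degree turn
# to imitate an unrotated solution whose pattern is hard to read.
clean = np.array([[0.9, 0.0], [0.8, 0.0], [0.0, 0.7], [0.0, 0.6]])
theta = np.deg2rad(30)
mix = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])
mixed = clean @ mix

rotated, rotation = varimax(mixed)
print(np.round(mixed, 2))
print(np.round(rotated, 2))
```

In the rotated output each variable should load heavily on one factor and near zero on the other, although the order and signs of the columns are arbitrary.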
TABLE
COMPARISON OF
UNROTATED
(decimals omitted)
Variable
Author, Fiction
A
1.
B
-52
-51
C
31
A
90
89
69 14
-02
B
08
07 15
h²
68
69
-05
-01
-17
82
2. Author,
3.
Children's Books
26
80
53
80
53
69
57
Newspaper reporter
Computer programmer
63 58 38 63
-32
53
59
16
27 29
16
4.
5
.
69
57
81
-08
03
Bookkeeper
Coll math teacher
76
73
-13
13 36
6.
7
.
44
-04 -04
22
62
20
-02
-20
-83
-77
62
Nurse
33
65.
-77
70
69
61
70
69
8.
9.
Doctor
Lab technician
-52
-^'.2
28
13
62
-68
61
312
151
139
602
223
197
183
602
Note that the communality for each variable remains the same regardless of
rotation.
This is another way of saying that the rotated factors taken together explain the same amount of each variable as the unrotated factors do.
It is just that the "weight" of each variable on each factor is now redistributed.
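The point about communalities can be checked numerically. In the sketch below the unrotated loadings and the 40-degree angle are hypothetical values chosen for the illustration; any orthogonal rotation would serve equally well.

```python
import numpy as np

# Hypothetical unrotated loadings for four variables on two factors.
unrotated = np.array([
    [0.6,  0.5],
    [0.7,  0.4],
    [0.5, -0.6],
    [0.4, -0.5],
])

# An arbitrary orthogonal rotation (40 degrees).
theta = np.deg2rad(40)
T = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
rotated = unrotated @ T

# Row sums of squares (communalities) before and after rotation.
h2_before = (unrotated ** 2).sum(axis=1)
h2_after = (rotated ** 2).sum(axis=1)

# Column sums of squares before and after rotation.
col_before = (unrotated ** 2).sum(axis=0)
col_after = (rotated ** 2).sum(axis=0)

print(h2_before, h2_after)    # identical
print(col_before, col_after)  # redistributed, same grand total
```

The row totals are unchanged while the column totals shift, which is the redistribution of weight described above.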
The
sums
loadings.
variables 4,
ti'.o
When
factored, as is often the case in marketing research, the analysis will extract
the largest and most interesting combinations of variables first and then proceed
to smaller combinations.
was
serving food.
The next was a group of foods used heavily by relatively low income
families, and the next was a group of products that are supposed to germproof and
but as the analysis proceeded the groups became smaller and less understandable,
until finally the "groups" consisted of only one of the remaining products each.
It is
exceedingly wasteful
of computer time, and it obscures the meaning of the findings because it
affects
the
rotation adversely,
When
rotation, the tendency is to produce rotated factors that have very high loadings
on a very few variables.
few factors
are rotated,
quite
when
how many factors he wants to get out, he can have the analysis stopped after
the desired number of factors have been extracted.
Iii
situation is rare.
much of
each variable the factors can explain (also a rare privilege in marketing research),
he can stop when that criterion is reached.
Most
commonly,
however, if he does
not know very much about his data to begin with, he will want to keep factoring
If, after a
certain number of factors have been extracted, the eigenvalue of the next factor
drops to a sharply lower level,
the
factor with the low eigenvalue may be dis-
Second, when all factors
whose
eigenvalues
are greater than unity have been extracted, the factoring may be stopped.
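Both stopping rules can be illustrated on a small example. The correlation matrix below is hypothetical, built so that two pairs of variables hang together, and the variable names are assumptions made for the sketch.

```python
import numpy as np

# Hypothetical correlation matrix: variables 1-2 and 3-4 form two groups.
R = np.array([
    [1.0, 0.8, 0.1, 0.1],
    [0.8, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.7],
    [0.1, 0.1, 0.7, 1.0],
])

# Eigenvalues of the correlation matrix, largest first.
eigenvalues = np.sort(np.linalg.eigvalsh(R))[::-1]

# Rule 1: look for a sharp drop from one eigenvalue to the next.
drops = eigenvalues[:-1] - eigenvalues[1:]
factors_before_largest_drop = int(np.argmax(drops)) + 1

# Rule 2: keep the factors whose eigenvalues exceed unity.
factors_above_unity = int((eigenvalues > 1.0).sum())

print(np.round(eigenvalues, 3))
print(factors_before_largest_drop, factors_above_unity)
```

In this contrived case the two rules agree on two factors; with real data they need not agree, which is one reason the choice remains a matter of judgment.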
has
come
know why
scientific
method.
What
to correlate,
the
the diagonal,
make.
more variables,
and more realistic in the sense that the pattern in the input variables is not
so self-evident as it was in
the
In this
New Yorker
common
than do Business
and Life
at least in
Note that
tlie
and therefore, eyeballing the matrix does not show any obvious simple groups.
When the relationships are as complex and as numerous as these (and this is
simple and small matrix compared to some
that
here)
(p.
15)
the side.
The entries in
each
the
columns are the factor loadings, the correlation between
This is an unrotated matrix and,
as usual,
The rotation
has clarified the factors, and they can now be interpreted as follows:
(Insert Table
6 here.)
(p.
16)
Factor
has high loadings on Car and Driver, Road and Track, Motor Trend
and Hot Rod.
This means that respondents who say they read Car and Driver
they
1.
In
5
PRINCIPAL
(decimals omitted)
Variable Description
Var. No.
Factor
1 2 3
i-t
10
Communality
Bus. Week
35
43
26
51
-12
07
04
06
16
-28
33
-33 -03
04
17
15
21
00
-28
-11
56
61 52 55
Life
AS
30
37
-27
19
-05
17
New Yorker
Time
-21
-19
-13
00
05 11
4
5
45
38
-04
05
-12
05
12
-08
05
-32
-20
33
-21
32
Newsweek
U.S. News & World Rpt.
Sat.
43
-03
-10
58
24
37
11
15
32
23
00
04 41 38
-12
41
18
14 32
18
47
57
Review
25
36
18
-10
15
-03
16
12
-01
22
14
-20
-02 -21 04
61 58
Look
Sat. Ev.
45
39
-20
-23
08
-05
Post
20
36
20
-07
19
23 -22
-16
-37
58
47 64
Forbes
10
11
12
20
45
21 37
01
-31
23 18
19
-05 -49
-14
17
Argosy
Atl. Mthly.
-13
35
-18
23
17
-07
-19
35
-21
10
05
36 01
-10 -46
46
49
-17
-24
04
-25
65
63
Car
6.
Driver
13
-38
-21
-05
U
15 16
17
50
-02
63
14
-43
32
-14
09 -19 22
-04
-11
00
-06
-32
-27
18
04
75
77
Farm
Jrnl.
-02
29
17
57
-09
44
35
-01 -15
-13 -04
-08
Fortune
-12 -32
24
-25
28
18 31 33 01
07
-16
19
58
63 63 75 69
67
Harpers
Mech. Illus.
Pop. Mech.
Pop.
-21
29
18 19
-31
-09 -03
-06
-04
-29
-16
-08
-01
01
02
57
-30
-23
-24 -08
25
26
36
-40 -36
-19
-04
-05
13
06
-01 01
Sci.
20
21 22
50
-10
43 30
29
-10
-00
44
Outdoor Life
Prog. Farmer
43
-03
21
-44
13
00
02
-24
02 02
04
-30
13
46 45
63
Reader's Dgst.
Road
&
23 24
25
-16
19
40
-09
-09 -18
11
-23
11
Track
42 26
-01
-37
18
-41
21
-10
-35
04
Sci. Amer.
-16
48
22
54
-10
25
-12
12
45
;6
-22
58
63
Succ. Frmng.
26
27
-11
-13
-01
-10
22
Sports Afield
True
Hot Rod
44
-23
-11
43
24
05
-39 -13 04 04
-10
15 18
05 03
03
64
65 51 65
28 29 30
45
36
48
-22
15 21
-12 -12
-52
-05
09
-41 -42
36
-37
00
03
-05
-01
Motor Trend
16
-13
06
Eigenvalues
413
292
225
169
150
144
125
106
101
100
1825
TABLE
ROTATED FACTOR MATRIX
(decimals omitted)
variable Description
Bus. Week Var. No.
Factor
1
9'
10
Communality
05
07
51
06
04
-lU
-06
-11
02
03
01
-15 -01
21
26 08
-06
09
-07
li
56
61
Life
20
37
li
iw
19
12
New Yorker
Time
3 A
5
00
-00
OU 02 09
05
31
-05
-01 04
53
07
52
65
-06 -09
-07 05
08
55 59
Newsweek
U.S. News & World Rpt.
Sat. Review
03
10
20
20
U
-06
-11
68
6
7
03
04 06
05
07 17
12
00
-03
05
07
12
07
17
03 09 09
14
-02
69
72 10
57
-03 -05
03 64
09 -01 03
02
26
13 11 15
61 58 58 47 64
Look
Sat. Ev. Post
71
00
03
08
10
20
-01 09
07 13
01
74
08
12
Forbes
10
11
12
-00
17
-06
12
-07
01
07
-03
75
Argosy
Atl. Mthly.
00
08
04
02
10
01 78
02
78
02
01
07 04
03
65 63
75
77
Car & Driver
Fld. & Stream
Farm Jrnl.
13
04 04 -03
72
-02
-04
-02
-02
-06
14
U
15
05
01
84
09
-02
01 07
10 -02 -01
-00
-04
19
04
09
05
-02
-00
07
Fortune Harpers
Mech. Illus.
Pop. Mech. Pop. Sci.
16
17
03
08 26 16
12
-00
-03 -01
14
15
-03
-01
58
15 10
-01
15
19
-09
72 82
75
-01
-02
09
63 63
18 19
09 03 05
01
07
-03
-01
03
00
-04
04
61 15
00
-07
01
04
10 01 03
05
04 13
05
05
75
69
67
20
21
00 01
07
12
03
81
13
Outdoor Life
Prog. Farmer
02
80
07
-02 -25
34
22
04
05
18
02
-07
-13
36
01
14 12 07
47
Reader's Dgst. 23
Road
&
24
77
-14
10
01
21
-08
10
45
63
58
Track
24
03 08
-03
06
-05 -08
03
13
04
-01
-07
Sci. Amer.
25 26
27
07 01 08
-00
75 10
06
-06
03
10
16
-04 -03
74
-01
17
Succ. Frmng.
-11
07
-01
01 04 -03
63 64
65
51
Sports Afield
-03
05
-07
78
23
-00
02 05
06
True
Hot Rod
28
29
06
68
77
-03
-01
75
13 12
-06
01
-01
-04
01
Motor Trend
30
-03
05
03
20
04
03
04
65
Sums of Squares
250
190
215
178
188
206
145
152
147
154
1823
and
some
degree of
comni. a
Factor
Business
Time and
Week.
more audience overlap with each other than with magazines that load highly
other factors.
Factor
Afield
.
Farming and
Progressive Farmer
Thus,
factor, looking for high loadings to determine what the various factors "mean."
The Meaning of "Loading"
and
magazines are large while the loadings of the other magazines are small.
This result is
exactly what
Other rotation
different
^..ma
unrotated matrix.
3
individual
if',.
(magazines
Forh^
c-crve
If the
In the present example, the cement that glues the groups is presumably editorial
content which makes magazines within groups appealing to somewhat the same group
of readers.
Instead its positive loadings are divided among the news group
and
Newsweek
,
farm
magazines.
broad appeal is not unlimited -- that in fact men who are heavy readers of the
magazine
Communality
The
column
shows the degree to which the factors account for or "explain" each of the
variables.
Thus the factors extracted in this analysis account for the reported
somewhat
readership of Life
of Business Week
Progressive
,
and they account for Outdoor Life better than they account for
or Reader's
Farmer
Digest
index to how much of the variable is in a sense "left over" after what it has
in common with other variables has been accounted for.
The
comparatively low
communality of Reader's Digest
in common with the other magazines included in this analysis, while the relatively high communalities of Field and Stream and Farm Journal show that they have
taken
as
groups
Factor Scores
degree to which each respondent gets high scores on the group of items that
load high on each factor.
magazines in Table
ranging from "never read" to "read four issues out of the last four."
The computer would determine each respondent's score on Factor 1
the
would be deter-
these magazines, he would have a high score on this factor; if he were a light
(or non-) reader of these magazines, his score on Factor
1
would be low.
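The idea of a factor score can be sketched with a crude weighted sum. The ratings, the loadings, and the scoring rule below are all assumptions made for the illustration; packaged programs typically use more refined estimates, and this is not presented as the authors' scoring method.

```python
import numpy as np

# Hypothetical readership ratings for four respondents on five magazines
# (0 = never read ... 4 = read four issues out of the last four).
ratings = np.array([
    [4, 4, 3, 0, 0],   # heavy reader of the first three magazines
    [0, 1, 0, 4, 3],   # light reader of the first three magazines
    [2, 2, 2, 2, 2],   # middling reader of everything
    [4, 3, 4, 1, 0],   # heavy reader of the first three magazines
], dtype=float)

# Hypothetical loadings of the five magazines on Factor 1:
# the first three load high, the last two hardly at all.
loadings_f1 = np.array([0.80, 0.70, 0.75, 0.05, 0.00])

# Standardize each magazine, then weight by its loading and add up.
z = (ratings - ratings.mean(axis=0)) / ratings.std(axis=0)
scores_f1 = z @ loadings_f1

print(np.round(scores_f1, 2))
```

Respondents who read the high-loading magazines heavily come out with high scores on the factor, and light readers come out low, which is all the score is meant to summarize.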
Tir:-G
p.
high Factor
Factor
2
scores.
with other variables, they can help explain what the factors mean.
if the interpretation of Factors
1
who
are significantly
younger than respondents who score high on Factor 2, and that respondents
while
referred to earlier, for example, the underlying difference between two groups
of cosmetics became obvious
when
In
two
was
clarified
was
were obvious once the factor scores had pointed them out, but without the factor
scores to serve as a guide through the tangle of correlations between products and
factor scores makes it possible to say what types of TV programs are viewed by
Comments on Q Analysis
The analysis just described is R-type factor analysis, by far the most
common.
on variable
variable
pattern of responses on all the variables is much like respondent 2's pattern
of responses.
within
groups of people.
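The difference between the two types of analysis lies in what is correlated with what. A minimal sketch with simulated data (the array sizes and names are assumptions made for the illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data matrix: 6 respondents (rows) by 10 variables (columns).
data = rng.normal(size=(6, 10))

# R-type input: correlations among variables, computed across respondents.
r_type = np.corrcoef(data, rowvar=False)   # 10 x 10

# Q-type input: correlations among respondents, computed across variables.
q_type = np.corrcoef(data, rowvar=True)    # 6 x 6

print(r_type.shape, q_type.shape)
```

The Q-type matrix has one row and column per respondent, which suggests why its size grows quickly as the sample gets larger.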
Q analysis is useful when
the
variables
It is therefore being
Currently
available Q analysis computer programs do not handle even moderately large samples
of respondents easily, and reliability tests have suggested that Q factors
are
that these problems will be overcome or at least better understood as time goes
by, and that Q analysis or one of its mathematical relatives will become a
Cautions
a
Cost
prodigious
studies employing fifty or more variables were almost never attempted, and even
much smaller studies required so many hours of labor on a hand calculator that
they were seldom replicated, checked for reliability or even examined for arithmetic errors.
Perhaps it was the heroic amount of effort required that led some
of the early analysts to believe that their work had revealed the Truth.
Although computerized factor analysis is now much faster and much easier,
it is still not costless.
with the number of respondents, and it increases much faster than linearly
the number of variables.
number of respondents.
variables to an already large analysis is apt to increase the cost far faster
than it increases value.
Reliability
starts
with
the sample, changes in data gathering procedures or any of the numerous kinds of
The results of
any single analysis are therefore always less than perfectly dependable.
that the analyst is tempted to say to himself, "What's interesting about this?
I
(He didn't).
or even stability.
make some sense when stared at long enough and hopefully enough.
(Armstrong and Soelberg, 1968.)
placed beside a somewhat different but equally plausible solution computed from
twice.
At
a minimum, divide the respondents into two groups at random by means of a random
the
other.
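A random split of the respondents can be sketched as follows. The data are simulated with one shared underlying factor, and the comparison of first principal components is a simple stand-in chosen for the illustration; in practice the full factor analysis would be run separately on each half and the two sets of loadings compared.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical data: 400 respondents, 5 variables driven by one shared factor.
factor = rng.normal(size=(400, 1))
weights = np.array([0.8, 0.7, 0.6, 0.5, 0.4])
data = factor * weights + rng.normal(scale=0.6, size=(400, 5))

# Divide the respondents into two groups at random.
order = rng.permutation(len(data))
half_a = data[order[:200]]
half_b = data[order[200:]]

def first_component(x):
    """Leading eigenvector of the correlation matrix, with its sign fixed."""
    values, vectors = np.linalg.eigh(np.corrcoef(x, rowvar=False))
    v = vectors[:, -1]               # eigh returns eigenvalues in ascending order
    return v * np.sign(v.sum())      # the sign of an eigenvector is arbitrary

a = first_component(half_a)
b = first_component(half_b)

# Cosine between the two solutions: near 1.0 means the halves agree.
agreement = float(a @ b)
print(round(agreement, 3))
```

A pattern that appears in one half but not the other is a warning that it may not be dependable.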
When
The sampling problem extends to the set of variables used in the analysis.
It should be obvious that a dimension cannot emerge from a factor analysis unless at least
two
It is perhaps
less
obvious that putting variables in and taking them out will influence the patterns
formed by other variables.
If some variables are added that have a strong relationship with
have
some
loadings
Judgment
It
has
cleaners--you don't
have
what
was
done to the suits or the data as long as they come back clean and free from wrinkles.
It should be clear by now that the problem is not that simple.
The user of factor analysis makes decisions that determine how the analysis will
come out, or else
the
him.
who
results of any one factor analysis will be Revealed Universal Truth Forever
Enduring, and it has sometimes led to disappointment and even indignant rejection
of the method.
(Ehrenberg, 1968.)
fairly complicated tools that may help untangle badly tangled data, the user
is much less liable to feel cheated when he tries to line up the results of a
factor analysis
When
that
might not
have
And
References
Armstrong, J. S. and P. Soelberg, "On the Interpretation of Factor Analysis," Psychological Bulletin, Vol. 70, 1968, pp. 361-364.
Cattell, R. B. (ed.), Handbook of Multivariate Experimental Psychology, Rand McNally & Co., Chicago, 1966.
Ehrenberg, A. S. C., "On Methods: The Factor Analytic Search for Program Types," Journal of Advertising Research, Vol. 8 (March 1968), pp. 55-70.
Harman, H., Modern Factor Analysis, 2nd Edition, The University of Chicago Press, Chicago, 1967.
Horst, P., Factor Analysis of Data Matrices, Holt, Rinehart and Winston, New York, 1965.
Nunnally, J. C., Psychometric Theory
Ramond, C. K., "Factor Analysis: When to Use It," in Shuchman, A., ed., Scientific Decision Making in Business, Holt, Rinehart and Winston, Inc., New York, 1963.
Cues on Q-Technique," Journal of Advertising (September 1969), pp. 53-60.
Stevenson, W.
Chicago:
University of Chicago