Factor Analysis

What is factor analysis?
• Factor analysis is a general name for a class of procedures used primarily for data reduction and summarization.
• Variables are not classified as either dependent or independent. Instead, the whole set of interdependent relationships among the variables is examined in order to define a set of common dimensions called factors.
Purpose of Factor Analysis
• To identify underlying dimensions, called factors, that explain the correlations among a set of variables.
-- For example, lifestyle statements may be used to measure the psychographic profile of consumers.
• To identify a new, smaller set of uncorrelated variables to replace the original set of correlated variables in subsequent analysis such as regression or discriminant analysis.
-- For example, psychographic factors may be used as independent variables to explain the difference between loyal and non-loyal customers.
Assumptions
• Models are usually based on linear relationships.
• Models assume that the data collected are interval scaled.
• Multicollinearity in the data is desirable, because the objective is to identify interrelated sets of variables.
• The data should be amenable to factor analysis. If each variable is correlated only with itself and with no other variable, the correlation matrix is an identity matrix, and factor analysis cannot be done on such data.
An Example
A study was conducted to determine customers' perceptions of the attributes of an airline. A set of 10 statements was constructed, and respondents were asked to rate each on a 7-point scale
(1 = completely agree, 7 = completely disagree).
The statements were as follows:
1. The airline is always on time
2. The seats are very comfortable
3. I love the food they provide
4. Their air-hostesses are very courteous
5. My boss/friend flies with the same airline
6. The airline has younger aircraft
7. I get the advantage of a frequent flyer program
8. It suits my schedule
9. My mom feels safe when I fly with this airline
10. Flying by this airline complements my lifestyle
and
Example (contd.)
• Do the ten statements indicate ten different factors that influence a customer to fly with this airline?
OR
• Are there correlations among these statements, so that we can identify only a few factors with which groups of these statements can be associated?
Factor Analysis – basic ideas

Each statement in the example is treated as a variable. Hence, for each respondent there is a score on each variable.
Ex:            V1  V2  V3  V4  V5  V6  V7  V8  V9  V10
Respondent 1    2   2   4   3   5   3   5   7   6    2
We can attach a suitable weight to each of the variable scores and calculate a weighted sum of them.
Ex: weight for V1 = 0.3, weight for V2 = 0.1, etc.
A score called the factor score can then be calculated as
Factor Score (Resp 1) = W1×2 + W2×2 + W3×4 + W4×3 + ...
A factor score can be calculated in the same way for each respondent. If there were 20 respondents, we would get a table containing 20 factor scores.
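
As a minimal sketch of the weighted sum described above, the Python snippet below computes one factor score for respondent 1. Only the weights for V1 (0.3) and V2 (0.1) come from the example; the other eight weights are made-up placeholders, not values from the study.

```python
import numpy as np

# Ratings of respondent 1 on V1..V10, taken from the example row above.
ratings = np.array([2, 2, 4, 3, 5, 3, 5, 7, 6, 2], dtype=float)

# Weights for V1 and V2 come from the example; the remaining eight are
# hypothetical values chosen only to make the sketch runnable.
weights = np.array([0.3, 0.1, 0.2, 0.1, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05])

# A factor score is the weighted sum of the respondent's variable scores.
factor_score = float(weights @ ratings)
print(f"Factor score for respondent 1: {factor_score:.2f}")
```

Repeating the same product for every respondent (one row of ratings each) gives one factor score per respondent.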
Factor Analysis – basic ideas (contd.)
• The weights assigned to the variables are not arbitrary; they are chosen so that the variance of the resulting factor scores is as large as possible (for weights of a fixed overall size, since otherwise the variance could be inflated simply by scaling the weights up).
• Once the first set of weights has been obtained, a new set of weights is chosen so that the new factor scores again show the maximum variance, subject to being uncorrelated with the first set of factor scores.
• This process is repeated until all the variance is explained by the factors.
• The first set of factor scores is then correlated with the data for variables 1 to 10. These correlations are called factor loadings: a factor loading is the correlation between the factor scores and a variable.
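
The procedure above can be sketched with NumPy. This is an illustration under stated assumptions rather than the exact computation used for the slides: the data are randomly generated (hypothetical), and the weight sets are taken to be the unit-length eigenvectors of the correlation matrix, which is how principal components are usually obtained.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ratings: 50 respondents x 4 variables. The first two variables
# share a common influence; the last two are independent noise.
common = rng.normal(size=(50, 1))
X = np.hstack([common + 0.5 * rng.normal(size=(50, 2)),
               rng.normal(size=(50, 2))])

# Standardize each variable and compute the correlation matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
R = np.corrcoef(Z, rowvar=False)

# Each eigenvector is one set of weights; its eigenvalue is the variance of
# the factor scores those weights produce. Sort from largest to smallest.
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Factor scores: one column per factor, one row per respondent.
scores = Z @ eigvecs

# Factor loadings: correlation of every variable with every set of scores.
n_vars = X.shape[1]
loadings = np.array([[np.corrcoef(Z[:, j], scores[:, k])[0, 1]
                      for k in range(n_vars)]
                     for j in range(n_vars)])

print("Variance of each set of factor scores:", scores.var(axis=0, ddof=1))
print("Loadings on the first factor:", loadings[:, 0])
```

The score columns come out mutually uncorrelated, and the loadings equal each eigenvector scaled by the square root of its eigenvalue.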
Factor Analysis – basic ideas (contd.)
An example will clarify what we have discussed so far. A data file in an Excel sheet can now be looked at to understand it.
The factors extracted in this way are obtained using a technique called Principal Component Analysis.
Determining the number of factors
• It is possible to extract as many factors as there are variables, but that would defeat the very purpose of factor analysis, so a smaller number of factors needs to be found. The question is: how many?
Several procedures are available:
• Determination based on eigenvalues.
An eigenvalue represents the amount of variance associated with a factor. Generally, only factors with an eigenvalue greater than 1.0 are included: with standardized variables each variable contributes a variance of 1, so a factor with an eigenvalue below 1 explains less than a single original variable.
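• Determination based on scree plot.
A scree plot is a plot of the eigenvalues against the number of factors, in order of extraction. Typically, the plot has a distinct break between the steep slope of the factors with large eigenvalues and a gradual trailing off for the rest of the factors. This trailing off is referred to as the scree, and the number of factors is usually read off at the point where the scree begins.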
Scree Plot
[Figure: scree plot of Eigenvalue (vertical axis, 0.0 to 3.0) against Component Number (horizontal axis, 1 to 6)]
• Determination based on percentage of variance.
The number of factors extracted is chosen so that the cumulative percentage of variance explained reaches a satisfactory level. What counts as satisfactory varies with the situation, but a cumulative value above 60% is generally considered satisfactory.
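
The three rules above can be compared on one list of eigenvalues. In this sketch the first two eigenvalues (2.731 and 2.218) are the ones reported later in these slides for a six-variable example; the remaining four are assumed values chosen so that the total equals 6, and the "largest drop" rule is only a crude stand-in for reading a scree plot by eye.

```python
import numpy as np

# First two eigenvalues are from the slides; the last four are assumptions
# picked so that the six eigenvalues sum to the number of variables (6).
eigenvalues = np.array([2.731, 2.218, 0.442, 0.341, 0.183, 0.085])

# Rule 1: keep factors whose eigenvalue exceeds 1.0.
n_eigen_rule = int((eigenvalues > 1.0).sum())

# Rule 2 (scree, crude numeric stand-in): keep the factors that come before
# the largest drop between consecutive eigenvalues.
drops = -np.diff(eigenvalues)
n_scree = int(np.argmax(drops)) + 1

# Rule 3: keep the smallest number of factors whose cumulative share of the
# variance reaches 60%.
cum_pct = np.cumsum(100.0 * eigenvalues / eigenvalues.sum())
n_variance = int(np.argmax(cum_pct >= 60.0)) + 1

print("Eigenvalue > 1 rule:", n_eigen_rule)
print("Largest-drop (scree) rule:", n_scree)
print("Cumulative variance rule:", n_variance)
print("Cumulative % of variance:", np.round(cum_pct, 1))
```

With these numbers all three rules point to two factors; on real data they can disagree, in which case interpretability of the factors usually decides.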
How to check suitability for Factor Analysis
• Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy: this index compares the magnitudes of the observed correlation coefficients with the magnitudes of the partial correlation coefficients. A value greater than 0.5 is generally considered good enough for conducting factor analysis on the data under consideration.
• Bartlett's test of sphericity: this test examines the hypothesis that the variables are uncorrelated in the population, i.e. that the population correlation matrix is an identity matrix. If the hypothesis can be rejected, the data are suitable for factor analysis.
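
Both checks can be computed directly from a correlation matrix. The sketch below uses the standard textbook formulas (Bartlett's chi-square approximation and the overall KMO index); the 3-variable correlation matrix and the sample size of 30 are hypothetical inputs chosen for illustration, and the function names are my own.

```python
import numpy as np

def bartlett_sphericity(R, n):
    """Chi-square statistic and degrees of freedom for Bartlett's test.

    R is a p x p correlation matrix, n is the number of respondents.
    """
    p = R.shape[0]
    _, logdet = np.linalg.slogdet(R)           # log of the determinant of R
    chi2 = -(n - 1 - (2 * p + 5) / 6.0) * logdet
    df = p * (p - 1) // 2
    return chi2, df

def kmo(R):
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy."""
    inv = np.linalg.inv(R)
    d = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(d, d)            # partial correlations
    off = ~np.eye(R.shape[0], dtype=bool)      # mask of off-diagonal cells
    r2 = (R[off] ** 2).sum()
    a2 = (partial[off] ** 2).sum()
    return r2 / (r2 + a2)

# Hypothetical correlation matrix: three variables, all correlated at 0.8.
R = np.array([[1.0, 0.8, 0.8],
              [0.8, 1.0, 0.8],
              [0.8, 0.8, 1.0]])

chi2, df = bartlett_sphericity(R, n=30)
print(f"Bartlett chi-square = {chi2:.2f} on {df} degrees of freedom")
print(f"KMO = {kmo(R):.3f}")
```

The chi-square statistic is compared with the critical value for its degrees of freedom (7.815 for 3 degrees of freedom at the 5% level); a statistic above the critical value rejects the hypothesis of uncorrelated variables.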
Conducting Factor Analysis

RESPONDENT
NUMBER       V1     V2     V3     V4     V5     V6
 1          7.00   3.00   6.00   4.00   2.00   4.00
 2          1.00   3.00   2.00   4.00   5.00   4.00
 3          6.00   2.00   7.00   4.00   1.00   3.00
 4          4.00   5.00   4.00   6.00   2.00   5.00
 5          1.00   2.00   2.00   3.00   6.00   2.00
 6          6.00   3.00   6.00   4.00   2.00   4.00
 7          5.00   3.00   6.00   3.00   4.00   3.00
 8          6.00   4.00   7.00   4.00   1.00   4.00
 9          3.00   4.00   2.00   3.00   6.00   3.00
10          2.00   6.00   2.00   6.00   7.00   6.00
11          6.00   4.00   7.00   3.00   2.00   3.00
12          2.00   3.00   1.00   4.00   5.00   4.00
13          7.00   2.00   6.00   4.00   1.00   3.00
14          4.00   6.00   4.00   5.00   3.00   6.00
15          1.00   3.00   2.00   2.00   6.00   4.00
16          6.00   4.00   6.00   3.00   3.00   4.00
17          5.00   3.00   6.00   3.00   3.00   4.00
18          7.00   3.00   7.00   4.00   1.00   4.00
19          2.00   4.00   3.00   3.00   6.00   3.00
20          3.00   5.00   3.00   6.00   4.00   6.00
21          1.00   3.00   2.00   3.00   5.00   3.00
22          5.00   4.00   5.00   4.00   2.00   4.00
23          2.00   2.00   1.00   5.00   4.00   4.00
24          4.00   6.00   4.00   6.00   4.00   7.00
25          6.00   5.00   4.00   2.00   1.00   4.00
26          3.00   5.00   4.00   6.00   4.00   7.00
27          4.00   4.00   7.00   2.00   2.00   5.00
28          3.00   7.00   2.00   6.00   4.00   3.00
29          4.00   6.00   3.00   7.00   2.00   7.00
30          2.00   3.00   2.00   4.00   7.00   2.00
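
As a sketch of the first computational steps, the 30 responses on V1 to V6 listed above can be turned into a correlation matrix and its eigenvalues with NumPy. The ratings are transcribed from the table; the choice of NumPy and the variable names are mine, and no output values are asserted here beyond what the code prints.

```python
import numpy as np

# Ratings of the 30 respondents on V1..V6, transcribed from the table above.
data = np.array([
    [7, 3, 6, 4, 2, 4], [1, 3, 2, 4, 5, 4], [6, 2, 7, 4, 1, 3],
    [4, 5, 4, 6, 2, 5], [1, 2, 2, 3, 6, 2], [6, 3, 6, 4, 2, 4],
    [5, 3, 6, 3, 4, 3], [6, 4, 7, 4, 1, 4], [3, 4, 2, 3, 6, 3],
    [2, 6, 2, 6, 7, 6], [6, 4, 7, 3, 2, 3], [2, 3, 1, 4, 5, 4],
    [7, 2, 6, 4, 1, 3], [4, 6, 4, 5, 3, 6], [1, 3, 2, 2, 6, 4],
    [6, 4, 6, 3, 3, 4], [5, 3, 6, 3, 3, 4], [7, 3, 7, 4, 1, 4],
    [2, 4, 3, 3, 6, 3], [3, 5, 3, 6, 4, 6], [1, 3, 2, 3, 5, 3],
    [5, 4, 5, 4, 2, 4], [2, 2, 1, 5, 4, 4], [4, 6, 4, 6, 4, 7],
    [6, 5, 4, 2, 1, 4], [3, 5, 4, 6, 4, 7], [4, 4, 7, 2, 2, 5],
    [3, 7, 2, 6, 4, 3], [4, 6, 3, 7, 2, 7], [2, 3, 2, 4, 7, 2],
], dtype=float)

# Correlation matrix of the six variables (columns).
R = np.corrcoef(data, rowvar=False)

# Eigenvalues of the correlation matrix, largest first.
eigenvalues = np.sort(np.linalg.eigvalsh(R))[::-1]

np.set_printoptions(precision=3, suppress=True)
print("Correlation matrix:\n", R)
print("Eigenvalues:", eigenvalues)
print("Cumulative % of variance:",
      np.cumsum(100 * eigenvalues / eigenvalues.sum()))
```

The eigenvalues always sum to the number of variables (6 here), which is why each one can be read as a share of the total variance.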
Correlation Matrix
Variables
V1 1.0
V2 -0.5
Results of Principal Components Analysis
Communalities
Variable   Initial
V1         1.0
V2         1.0
V3         1.0
Eigenvalues
Factor   Eigenvalue
1        2.731
2        2.218
Rotate Factors
• Although the initial or unrotated factor matrix indicates the relationship between the factors and individual variables, it seldom results in factors that can be interpreted, because the factors are correlated with many variables. Therefore, through rotation the factor matrix is transformed into a simpler one that is easier to interpret.
• In rotating the factors, we would like each factor to have nonzero, or significant, loadings or coefficients for only some of the variables. Likewise, we would like each variable to have nonzero or significant loadings with only a few factors, if possible with only one.
• The rotation is called orthogonal rotation if the axes are maintained at right angles.
Rotate Factors (contd.)
• The most commonly used method for rotation is the varimax procedure. This is an orthogonal method of rotation that minimizes the number of variables with high loadings on a factor, thereby enhancing the interpretability of the factors. Orthogonal rotation results in factors that are uncorrelated.
• The rotation is called oblique rotation when the axes are not maintained at right angles, and the factors are correlated. Sometimes, allowing for correlations among factors can simplify the factor pattern matrix. Oblique rotation should be used when factors in the population are likely to be strongly correlated.
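
A varimax rotation can be sketched in a few lines of NumPy. This is the commonly used SVD-based iteration for the raw varimax criterion, without the Kaiser normalization that statistical packages often apply, so its output need not match a package's output exactly. The unrotated loading matrix below is a hypothetical example, not a table recovered from these slides.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonally rotate a loading matrix toward the varimax criterion.

    Returns the rotated loadings and the orthogonal rotation matrix.
    """
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient of the varimax criterion, projected back onto the set of
        # orthogonal matrices through a singular value decomposition.
        grad = loadings.T @ (rotated ** 3
                             - rotated @ np.diag((rotated ** 2).sum(axis=0)) / p)
        u, s, vt = np.linalg.svd(grad)
        rotation = u @ vt
        new_criterion = s.sum()
        if new_criterion < criterion * (1 + tol):
            break
        criterion = new_criterion
    return loadings @ rotation, rotation

def simplicity(loadings):
    """Varimax criterion: variance of squared loadings, summed over factors."""
    return (loadings ** 2).var(axis=0).sum()

# Hypothetical unrotated loadings for six variables on two factors.
unrotated = np.array([[ 0.928,  0.253],
                      [-0.301,  0.795],
                      [ 0.936,  0.131],
                      [-0.342,  0.789],
                      [-0.869, -0.351],
                      [-0.177,  0.871]])

rotated, rotation = varimax(unrotated)
np.set_printoptions(precision=3, suppress=True)
print("Rotated loadings:\n", rotated)
print("Criterion before:", simplicity(unrotated), "after:", simplicity(rotated))
```

Because the rotation matrix is orthogonal, each variable's communality (the row sum of squared loadings) is unchanged; only the way the explained variance is shared between the factors changes.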
Results of Principal Components Analysis (contd.)
Rotation Sums
Factor   Eigenvalue
1        2.68
2        2.26
Interpret Factors
• A factor can then be interpreted in terms of the variables that load high on it.
• Another useful aid in interpretation is to plot the variables, using the factor loadings as coordinates. Variables at the end of an axis are those that have high loadings on only that factor, and hence describe the factor.
Factor Loading Plot

Rotated Component Matrix
Variable   Factor 1   Factor 2
V1           0.962     -0.0266
V2          -0.0572     0.848
V3           0.934     -0.146
V4          -0.0983     0.854
V5          -0.933     -0.0840
V6           0.08337    0.885

[Figure: Factor Plot in Rotated Space; variables V1 to V6 plotted with Factor 1 on the horizontal axis and Factor 2 on the vertical axis, both running from -1.0 to 1.0]
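
To make the interpretation step concrete, the sketch below takes the rotated loadings from the Rotated Component Matrix above and assigns each variable to the factor on which it loads highest in absolute value. The assignment rule and the variable names in the code are my own simplification; in practice the size and sign of each loading are also inspected.

```python
import numpy as np

variables = ["V1", "V2", "V3", "V4", "V5", "V6"]

# Rotated loadings (Factor 1, Factor 2) from the Rotated Component Matrix.
rotated = np.array([[ 0.962,   -0.0266],
                    [-0.0572,   0.848],
                    [ 0.934,   -0.146],
                    [-0.0983,   0.854],
                    [-0.933,   -0.0840],
                    [ 0.08337,  0.885]])

# Index of the factor with the largest absolute loading for each variable.
dominant = np.abs(rotated).argmax(axis=1)

# Group the variables by the factor they describe.
groups = {f"Factor {k + 1}": [v for v, d in zip(variables, dominant) if d == k]
          for k in range(rotated.shape[1])}

# Column sums of squared loadings: variance attributed to each rotated factor.
variance_per_factor = (rotated ** 2).sum(axis=0)

for name, members in groups.items():
    print(name, "is described by", members)
print("Variance per factor:", np.round(variance_per_factor, 3))
```

V1, V3 and V5 fall on Factor 1 (V5 with a negative loading) and V2, V4 and V6 on Factor 2, which is the same grouping the loading plot shows at the ends of the two axes.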

A few examples
We can now take a few examples with hypothetical data and run factor analysis using the SPSS package.