
FACTOR ANALYSIS

A successful analytical tool for the real world.

Presented by:
Gaurav Mittal
MBA
NIT-Trichy
Factor
Analysis
It is desirable to collect as much data
as possible in a research study to reach a
result.
As the number of variables increases,
the number of correlations
increases even faster.
The problem then arises of condensing
those variables into a manageable
number. This
task of data reduction or
summarization is achieved by
factor analysis.
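To illustrate how quickly the correlations pile up: p variables have p(p - 1)/2 distinct pairwise correlations. The short sketch below tabulates that count; the function name `correlation_count` is chosen here for illustration only.

```python
def correlation_count(p: int) -> int:
    """Number of distinct pairwise correlations among p variables."""
    return p * (p - 1) // 2


# The count grows roughly with the square of the number of variables.
for p in (5, 10, 20, 40):
    print(f"{p} variables -> {correlation_count(p)} correlations")
```

With only 6 variables there are already 15 correlations to inspect, and with 40 variables there are 780.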
• Factor Analysis is a multivariate
statistical technique, which is used for
data reduction by identifying an
underlying structure in the data.
• It reduces the data while retaining
“as much” of the original information as
possible. That is, the variance in the
original data (say 100%) can be
explained to a large extent (say
around 80%) by an optimum number of
reduced variables.
• But when we want to explain 100%
To reduce a large number of variables to a
smaller number of factors for modeling
purposes, where the large number of
variables precludes modeling all the
measures individually.
To establish that multiple tests measure the
same factor, thereby giving justification for
administering fewer tests.
To validate a scale or index by
demonstrating that its constituent items
load on the same factor, and to drop
To create a set of factors to be
treated as uncorrelated variables as
one approach to handling
multicollinearity in such procedures
as multiple regression.
To identify clusters of cases and/or
outliers.
To determine network groups by
determining which sets of people
cluster together.
In general, the process of factor analysis can
be divided into three major steps:

Formulation of the data set.

Estimation of correlation/covariance matrix

Extraction and rotation of factors


• i) Formulation of the data set: The data
set is to be formulated in accordance with
the objective of the research. The variables
must be measured on an
interval scale or ratio scale. It is advisable,
though not mandatory,
to take a sample size of about 4 or 5
times the number of variables.
• ii) Formulation of the correlation or
covariance matrix: The data set from
the above step is converted into a
correlation or a covariance matrix.
Here we will see how to form a
correlation matrix in our example. A
correlation matrix is the matrix showing
how strongly each pair of variables is
correlated.
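As a minimal sketch of this step, the raw data can be turned into a correlation matrix with NumPy. The data below are randomly generated stand-ins (30 hypothetical respondents, 6 items), not the survey data from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical raw data: 30 respondents (rows) x 6 survey items (columns).
data = rng.normal(size=(30, 6))
data[:, 1] += 0.8 * data[:, 0]  # make items 0 and 1 move together

# rowvar=False tells NumPy that each column is a variable.
R = np.corrcoef(data, rowvar=False)
print(np.round(R, 2))
```

The result is a symmetric 6 x 6 matrix with ones on the diagonal; entry (i, j) is the correlation between items i and j.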
• iii) Method of extraction:
There are various methods of extracting
factors from the correlation/covariance
matrix. They are:
• Principal component analysis
• Common factor analysis
– Principal factor analysis
– Maximum likelihood method
– Alpha method
– Image factoring method
– Unweighted least squares method
– Generalized least squares method
• Principal component analysis (PCA) and
common factor analysis (CFA) differ in terms
of their conceptual underpinnings. The
factors produced by PCA are
conceptualized as being linear
combinations of the variables whereas
the factors produced by CFA are
conceptualized as being latent (hidden,
concealed) variables.
• PCA is generally preferred for purposes
of data reduction (translating variable
space into optimal factor space), while
CFA is generally preferred when the
research purpose is detecting data
• The factor analysis model can be expressed in
matrix notation:
$x = \Lambda f + U$, where
$\Lambda = \{\lambda_{ij}\}$ is a $p \times k$ matrix of constants, called the matrix
of factor loadings.
$f$ = random vector representing the k common factors.
$U$ = random vector representing the p unique factors
associated with the original variables.

• The common factors F1, F2, …, Fk are common to all X
variables, and are assumed to have mean = 0 and
variance = 1. The unique factors are unique to Xi. The
unique factors are also assumed to have mean = 0 and
are uncorrelated with the common factors.
• Equivalently, the covariance matrix S can be
decomposed into a factor covariance matrix and an
error covariance matrix:
• $S = \Lambda \Lambda^{T} + \Psi$, where
$\sigma_i^2 - \Psi_i = \sum_{j=1}^{k} \lambda_{ij}^2$
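The decomposition can be checked numerically. The sketch below simulates data from the model x = Λf + U and compares the sample covariance with ΛΛᵀ + Ψ. The loading values, the sample size, and the choice of normally distributed factors are assumptions made for illustration, not values from the slides.

```python
import numpy as np

rng = np.random.default_rng(1)
p, k, n = 6, 2, 200_000  # variables, common factors, simulated cases

# Hypothetical p x k loading matrix.
Lambda = np.array([
    [0.8, 0.1],
    [0.7, 0.2],
    [0.9, 0.0],
    [0.1, 0.8],
    [0.2, 0.7],
    [0.0, 0.6],
])

# Unique variances chosen so that every variable has total variance 1.
psi = 1.0 - np.sum(Lambda**2, axis=1)

f = rng.normal(size=(n, k))                 # common factors: mean 0, variance 1
U = rng.normal(size=(n, p)) * np.sqrt(psi)  # unique factors: mean 0, variance psi_i
x = f @ Lambda.T + U                        # each row is one observation of x

S_model = Lambda @ Lambda.T + np.diag(psi)
S_sample = np.cov(x, rowvar=False)
max_gap = np.max(np.abs(S_sample - S_model))
print("largest deviation between sample and model covariance:", max_gap)
```

With a large simulated sample the deviation is small, which is what the decomposition predicts when the common and unique factors are uncorrelated.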
• The factor loadings are the correlation
coefficients between the variables and
factors.
• The sum of the squared factor loadings for
all factors for a given variable is the
variance in that variable accounted for by
all the factors, and this is called the
communality.
• The factor analysis model does not extract
all the variance; it extracts only that
proportion of variance, which is due to the
common factors and shared by several
items.
• The proportion of variance that is unique to
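The communality described above can be computed directly from a loading matrix as a row-wise sum of squared loadings. The loadings below are hypothetical, and the variables are assumed to be standardized, so that the remaining variance of each variable is one minus its communality.

```python
import numpy as np

# Hypothetical loadings: 6 variables (rows) on 2 factors (columns).
Lambda = np.array([
    [0.8, 0.1],
    [0.7, 0.2],
    [0.9, 0.0],
    [0.1, 0.8],
    [0.2, 0.7],
    [0.0, 0.6],
])

# Communality: sum of squared loadings across all factors, per variable.
communality = np.sum(Lambda**2, axis=1)

# For standardized variables (variance 1), the rest is not shared.
uniqueness = 1.0 - communality

for i, (h2, u2) in enumerate(zip(communality, uniqueness), start=1):
    print(f"variable {i}: communality = {h2:.2f}, remaining variance = {u2:.2f}")
```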
• After extracting the factors, one needs to
assign the variables to them, i.e., to say which variables come
under which factors.
• One factor can explain the variance
of more than one variable, but
the variance in any one variable should be
explained by one factor.
• To achieve this end we rotate the
factors. There are basically two broad
categories of rotation: orthogonal rotation and
oblique rotation.
• The goal of these rotation strategies is to obtain
a clear pattern of loadings, i.e., the factors are
somehow clearly marked by high loadings for
Aim to reduce the value of θ to zero.

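$$
\tan 4\theta =
\frac{2\left[\, p \sum_h \left(x_{hr}^2 - x_{hs}^2\right)\left(2 x_{hr} x_{hs}\right)
      - \sum_h \left(x_{hr}^2 - x_{hs}^2\right) \sum_h 2 x_{hr} x_{hs} \right]}
     {p \sum_h \left[\left(x_{hr}^2 - x_{hs}^2\right)^2 - \left(2 x_{hr} x_{hs}\right)^2\right]
      - \left\{ \left[\sum_h \left(x_{hr}^2 - x_{hs}^2\right)\right]^2
      - \left[\sum_h 2 x_{hr} x_{hs}\right]^2 \right\}}
$$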

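Quadrant of 4θ and range of θ, by the signs of the numerator and denominator:

                      Sign of numerator
Sign of denominator   +                        -
+                     I.  0° < θ < 22.5°       IV.  -22.5° < θ < 0°
-                     II. 22.5° < θ < 45°      III. -45° < θ < -22.5°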
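The angle formula and the sign table can be sketched in code for a single pair of factors r and s. Here x_hr and x_hs are read as the loadings of variable h on the two factors and p as the number of variables; that reading, the function names, and the use of raw loadings (not rescaled by communalities) are assumptions of this sketch. `np.arctan2` takes the numerator and denominator separately, so it selects the quadrant of 4θ from their signs in the same way as the table.

```python
import numpy as np


def varimax_angle(x_r: np.ndarray, x_s: np.ndarray) -> float:
    """Rotation angle theta (radians) for one pair of factor-loading columns."""
    p = len(x_r)
    u = x_r**2 - x_s**2
    v = 2.0 * x_r * x_s
    numerator = 2.0 * (p * np.sum(u * v) - np.sum(u) * np.sum(v))
    denominator = p * np.sum(u**2 - v**2) - (np.sum(u) ** 2 - np.sum(v) ** 2)
    # arctan2 places 4*theta in the correct quadrant from the two signs.
    return 0.25 * np.arctan2(numerator, denominator)


def rotate_pair(x_r: np.ndarray, x_s: np.ndarray, theta: float):
    """Orthogonal rotation of the two loading columns by theta."""
    c, s = np.cos(theta), np.sin(theta)
    return c * x_r + s * x_s, -s * x_r + c * x_s


# Hypothetical unrotated loadings of six variables on two factors.
x_r = np.array([0.7, 0.6, 0.8, 0.5, 0.6, 0.7])
x_s = np.array([0.5, 0.4, 0.3, -0.5, -0.6, -0.4])

theta = varimax_angle(x_r, x_s)
new_r, new_s = rotate_pair(x_r, x_s, theta)
print("theta in degrees:", np.degrees(theta))
print("angle still needed after rotating:", np.degrees(varimax_angle(new_r, new_s)))
```

After rotating by θ, recomputing the angle for the rotated pair gives essentially zero, which is one way to read the aim of reducing θ to zero. With more than two factors, the same step would be repeated over all pairs of factors until every angle is negligible.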
Take an example with 6
variables,
namely:
Ability to define problems
Ability to supervise others
Ability to make decisions
Ability to build consensus
Ability to facilitate decision-making
Ability to work on a team

Assume that you have some raw data
taken from a survey. Then proceed as shown
in the next slides.
Find the eigenvalues of the
correlation matrix of the variables.

Since two eigenvalues are greater
than one, we take two factors to
represent the six variables.
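A minimal sketch of this eigenvalue step follows. The correlation matrix is hypothetical (two clusters of three items each), not the one from the survey, so the resulting numbers will differ from those on the slides.

```python
import numpy as np

# Hypothetical correlation matrix: items 1-3 and items 4-6 form two clusters.
R = np.full((6, 6), 0.1)
R[:3, :3] = 0.7
R[3:, 3:] = 0.7
np.fill_diagonal(R, 1.0)

# eigvalsh is meant for symmetric matrices and returns ascending eigenvalues.
eigenvalues = np.linalg.eigvalsh(R)[::-1]

# Retain as many factors as there are eigenvalues greater than one.
n_factors = int(np.sum(eigenvalues > 1.0))

# Share of total variance carried by the retained factors.
explained = eigenvalues[:n_factors].sum() / eigenvalues.sum()

print("eigenvalues:", np.round(eigenvalues, 3))
print("factors retained:", n_factors)
print("share of variance explained:", round(float(explained), 3))
```

For this made-up matrix two eigenvalues exceed one (2.7 and 2.1), so two factors are retained, and together they account for 80% of the total variance.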
The extracted factors may be ambiguous,
in that we do not yet know which
factor is actually explaining which variable.
Rotation of factors shows which
variables are actually explained by each
factor.
Now we have a highly interpretable
solution, which accounts for almost 90% of
the variance in the data.
The next step is to name the factors.
There are a few rules suggested by
methodologists:
Factor names should
• be brief, one or two words
• communicate the nature of the underlying construct

The first factor can be named “ability to
take judgment”, and the second
“ability to perceive others”.
Thank you
