
Introduction to Factor Analysis

R. Venkatesakumar, Department of Management Studies (SOM), Pondicherry University

Uniqueness of Factor Analysis


Factor analysis is distinctive in that it is an 'interdependence' technique. It does not treat the variables entered in the analysis as dependent or independent; instead, it considers all the variables in the analysis as interdependent.

Factor Analysis


Objective of Factor Analysis


The primary purpose of factor analysis is to define the underlying structure in the data matrix, that is, the grouping of variables. It is therefore very useful for reducing a large number of variables to a smaller 'representative' subset that still possesses the characteristics of the original set of variables.

Factor analysis - Research Design


Specific questions such as:
o purpose/objective of the analysis
o type of the analysis
o variables considered in the analysis
o sample size requirements
o assessing the characteristics of the sample

Objective of the Analysis


Identification of structure through summarizing the data, or data reduction from a larger set of variables to a manageable number of dimensions (identifying a representative set of variables).
o Summarization: the structure is identified by examining the correlations between the variables, or between the respondents.
o Data reduction: factor analysis focuses on identifying a set of representative 'factors', fewer in number than the original variables.

Creation of an entirely new set of variables

Cases Vs. Variables


Issues related to variables


Variables
o Normally metric variables are used, that is, variables that are either ratio scaled or interval scaled.
o Non-metric variables, especially dummy variables, are sometimes used as well.
o Specifying the variables to be included in the factor analysis is a crucial task.

Issues related to variables (cont.)

o Include 5 or 7 variables to measure the same feature.
o The strength of factor analysis is finding patterns among the variables. If the variables are conceptually well defined, the derived factors represent more meaningful concepts.
o Remember that including irrelevant variables, or too many variables, will distort the results.

Issues related to sample size


Sample Size
o The preferred sample size for factor analysis is 100 or larger.
o A ratio of 10:1 (i.e., 10 observations per variable) or 20:1 is sometimes recommended, since a higher ratio makes the results more reliable.
o Do not attempt factor analysis when the sample size is less than 50.
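
The sample-size rules of thumb above can be checked mechanically. The sketch below is only an illustration: the function name and the returned keys are made up for this example, and the thresholds (50, 100, 10:1, 20:1) are the ones quoted on this slide.

```python
def check_sample_size(n_observations, n_variables):
    """Apply the sample-size rules of thumb quoted on the slide."""
    ratio = n_observations / n_variables
    return {
        "ratio": ratio,                               # observations per variable
        "too_small": n_observations < 50,             # do not attempt below 50
        "meets_preferred_100": n_observations >= 100,
        "meets_10_to_1": ratio >= 10,
        "meets_20_to_1": ratio >= 20,
    }


# Hypothetical study: 120 respondents, 10 variables.
report = check_sample_size(n_observations=120, n_variables=10)
print(report)
```

In this hypothetical case the study clears the 100-observation and 10:1 guidelines but not the stricter 20:1 ratio.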

Step 3: Basic assumptions about data

Partial correlation between the variables
o If the partial correlations are low, the variables can be explained by the factors.
o Otherwise no true factors exist, and a factor analysis is inappropriate in that situation.
o A high partial correlation (anti-image correlation) indicates that the variables are not suited for factor analysis.

Measure of Sampling Adequacy (MSA)


Multicollinearity is assessed using the MSA (measure of sampling adequacy), which is measured by the Kaiser-Meyer-Olkin (KMO) statistic. As a measure of sampling adequacy, the KMO predicts whether the data are likely to factor well, based on correlations and partial correlations. The KMO can be used to identify which variables to drop from the factor analysis because they lack multicollinearity. There is a KMO statistic for each individual variable, and an overall KMO statistic computed in the same way across all the variables.

Measure of Sampling Adequacy (MSA)


The MSA quantifies the inter-correlation among the variables. The coefficient ranges from 0 to 1, where 1 means that each variable is perfectly predicted by the other variables.

The MSA is first applied to the individual variables: variables falling in the unacceptable range are eliminated one at a time until the overall KMO rises above .50 and every individual variable's KMO is above .50. The variables that meet this criterion are then considered for the overall Measure of Sampling Adequacy test.

Table showing MSA coefficient ranges and their interpretation


Range of the Coefficient    Remark
0.80 or above               Meritorious
0.70 - 0.80                 Middling
0.60 - 0.70                 Mediocre
0.50 - 0.60                 Miserable
< 0.50                      Unacceptable
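
As a rough illustration of how the MSA is obtained and read against the table above, the sketch below computes the overall and per-variable KMO from a correlation matrix, using the standard ratio of squared correlations to squared correlations plus squared partial correlations. The 3 x 3 correlation matrix is hypothetical, and the helper names (`invert`, `kmo`, `msa_label`) are made up for this example.

```python
import math


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def kmo(corr):
    """Return (overall KMO, list of per-variable MSA values)."""
    n = len(corr)
    inv = invert(corr)
    # Partial (anti-image) correlations from the inverse correlation matrix.
    partial = [[-inv[i][j] / math.sqrt(inv[i][i] * inv[j][j])
                for j in range(n)] for i in range(n)]
    per_variable, r2_total, p2_total = [], 0.0, 0.0
    for i in range(n):
        r2 = sum(corr[i][j] ** 2 for j in range(n) if j != i)
        p2 = sum(partial[i][j] ** 2 for j in range(n) if j != i)
        per_variable.append(r2 / (r2 + p2))
        r2_total += r2
        p2_total += p2
    return r2_total / (r2_total + p2_total), per_variable


def msa_label(value):
    """Map an MSA coefficient to the remark in the table above."""
    if value >= 0.80:
        return "Meritorious"
    if value >= 0.70:
        return "Middling"
    if value >= 0.60:
        return "Mediocre"
    if value >= 0.50:
        return "Miserable"
    return "Unacceptable"


# Hypothetical correlation matrix for three variables.
corr = [[1.0, 0.6, 0.5],
        [0.6, 1.0, 0.4],
        [0.5, 0.4, 1.0]]
overall, per_variable = kmo(corr)
print(round(overall, 3), msa_label(overall))
print([round(v, 3) for v in per_variable])
```

Note that the overall value is computed over all variable pairs at once; it is not the sum of the per-variable values.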

Testing Assumptions of Factor Analysis

o There must be a strong conceptual foundation to support the assumption that a structure does exist before the factor analysis is performed.

o A statistically significant Bartlett's test of sphericity (sig. < .05) indicates that sufficient correlations exist among the variables to proceed.

o Measure of Sampling Adequacy (MSA) values must exceed .50 for both the overall test and each individual variable. Variables with values less than .50 should be omitted from the factor analysis one at a time, with the smallest one being omitted each time.
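
For reference, Bartlett's test statistic can be computed directly from the determinant of the correlation matrix. The sketch below uses the usual chi-square approximation, statistic = -[(n - 1) - (2p + 5)/6] ln|R| with p(p - 1)/2 degrees of freedom. The correlation matrix and the sample size of 100 are hypothetical, and the helper names are made up for this example. To stay within the standard library it compares the statistic with the tabulated critical value for 3 degrees of freedom (7.815 at the .05 level) rather than computing a p-value.

```python
import math


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(row) for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            a[r] = [v - factor * p for v, p in zip(a[r], a[col])]
    return det


def bartlett_sphericity(corr, n_observations):
    """Return (chi-square statistic, degrees of freedom)."""
    p = len(corr)
    multiplier = (n_observations - 1) - (2 * p + 5) / 6
    statistic = -multiplier * math.log(determinant(corr))
    degrees_of_freedom = p * (p - 1) // 2
    return statistic, degrees_of_freedom


# Hypothetical correlation matrix for three variables, 100 respondents.
corr = [[1.0, 0.6, 0.5],
        [0.6, 1.0, 0.4],
        [0.5, 0.4, 1.0]]
statistic, df = bartlett_sphericity(corr, n_observations=100)

CRITICAL_VALUE_DF3 = 7.815  # chi-square table, df = 3, alpha = .05
significant = statistic > CRITICAL_VALUE_DF3
print(round(statistic, 2), df, significant)
```

A significant result means the correlation matrix differs from an identity matrix, so there is something for the factors to explain.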

Three types of variance
(i) Common variance: the variance in a variable that is shared with all other variables in the analysis.
(ii) Specific variance: the variance associated with only a specific variable.
(iii) Error variance: the variance due to measurement error or unreliable responses from the respondents.

Extraction Method Determines the Types of Variance Carried into the Factor Matrix

Diagonal Value    Variance extracted                           Variance not used
Unity (1)         Total variance (common, specific and error)  None
Communality       Common variance                              Specific and error variance

Factor Extraction - basic procedures


Two basic extraction methods are available for deriving factors:
(i) Common Factor Analysis (CFA)
(ii) Principal Component Analysis (PCA)

Method of Extraction
Principal Component Analysis considers the total variance (common, specific and error), and the extracted factors explain the maximum amount of total variance in the variables. It inserts 1's on the diagonal of the correlation matrix, thus considering all of the available variance. It is most appropriate when the concern is deriving a minimum number of factors to explain a maximum portion of variance in the original variables, and the researcher knows that the specific and error variances are small.

Common Factor Analysis


Common Factor Analysis, on the other hand, seeks to explain the maximum amount of variance that is shared by the variables in the analysis. It uses only the common variance and places communality estimates on the diagonal of the correlation matrix. It is most appropriate when the aim is to reveal latent dimensions of the original variables and the researcher does not know the nature of the specific and error variance.
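
The difference between the two methods comes down to what sits on the diagonal of the correlation matrix. The sketch below keeps the 1's for PCA and, for common factor analysis, replaces them with communality estimates. The slides do not say which estimate to use; squared multiple correlations (SMC), computed as 1 - 1/r^ii from the inverse correlation matrix, are assumed here because they are a common starting point. The correlation matrix and the helper names are hypothetical.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def reduced_correlation_matrix(corr):
    """Replace the unit diagonal with SMC communality estimates (assumed choice)."""
    inv = invert(corr)
    smc = [1.0 - 1.0 / inv[i][i] for i in range(len(corr))]
    reduced = [list(row) for row in corr]  # copy; the input stays untouched
    for i, estimate in enumerate(smc):
        reduced[i][i] = estimate
    return reduced, smc


# Hypothetical correlation matrix for three variables.
corr = [[1.0, 0.6, 0.5],
        [0.6, 1.0, 0.4],
        [0.5, 0.4, 1.0]]

pca_diagonal = [corr[i][i] for i in range(len(corr))]   # all 1's: total variance
reduced, cfa_diagonal = reduced_correlation_matrix(corr)  # common variance only
print(pca_diagonal)
print([round(v, 3) for v in cfa_diagonal])
```

The off-diagonal correlations are identical in both cases; only the diagonal, and therefore the amount of variance offered to the factors, changes.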

Number of Factors to be extracted


(i) Latent Root Criterion
(ii) Percentage of Variance Criterion
(iii) Scree Test
(iv) A Priori Criterion

Understanding these criteria requires knowledge of 'Factor Loadings', 'Eigen Values' and 'Communalities'.

Latent Root Criterion


Any individual factor should account for the variance of at least a single variable if it is to be retained. Because each standardized variable contributes a variance of 1, only factors with an Eigen Value (latent root) greater than 1 are retained.
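
A minimal sketch of the latent root criterion follows. The Eigen Values are hypothetical, chosen for a six-variable analysis so that they sum to 6, and the function name is made up for this example.

```python
def latent_root_criterion(eigenvalues, threshold=1.0):
    """Count the factors whose eigenvalue exceeds the threshold."""
    return sum(1 for value in eigenvalues if value > threshold)


# Hypothetical eigenvalues from a six-variable analysis (they sum to 6).
eigenvalues = [2.75, 1.5, 0.75, 0.5, 0.25, 0.25]
n_factors = latent_root_criterion(eigenvalues)
print(n_factors)
```

Here two factors would be retained, since only the first two explain more variance than a single variable does.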

Percentage of Variance Criterion


Factors are extracted until a pre-specified cumulative percentage of variance is reached.
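
This criterion can be sketched as a running total. The Eigen Values below are hypothetical (six variables), and the 60% target is only an assumed threshold for illustration; the slides do not prescribe a particular percentage.

```python
def factors_for_variance(eigenvalues, target_percent):
    """Return (number of factors, cumulative % of variance) once the target is met."""
    total = sum(eigenvalues)
    cumulative = 0.0
    ordered = sorted(eigenvalues, reverse=True)
    for count, value in enumerate(ordered, start=1):
        cumulative += 100.0 * value / total
        if cumulative >= target_percent:
            return count, cumulative
    return len(ordered), cumulative


# Hypothetical eigenvalues from a six-variable analysis.
eigenvalues = [2.75, 1.5, 0.75, 0.5, 0.25, 0.25]
count, cumulative = factors_for_variance(eigenvalues, target_percent=60.0)
print(count, round(cumulative, 2))
```

Raising the target to 80% with the same values would require a third factor.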

Scree Test
The scree test identifies the number of factors that can be extracted before the unique variance begins to dominate the common variance.

Initial Communalities
The initial communality is the amount of variance in a variable that is shared with all the other variables and that enters the analysis. With Principal Component Analysis (PCA), the initial communality of every variable is one, which indicates that the full variance of the variable is used in the analysis.

Factor Loadings
A factor loading is the correlation between an original variable and a factor. The proportion of the variable's variance explained by the factor is the square of that correlation (as with the coefficient of determination). The sum of the squared factor loadings for a variable gives the proportion of its variance extracted by all the factors. This is displayed as 'Final Communalities' in the results.
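
The arithmetic behind the final communalities can be sketched as a row-wise sum of squared loadings. The loading matrix below (four variables, two factors) is hypothetical and the function name is made up for this example.

```python
def final_communalities(loadings):
    """Sum of squared loadings across factors, one value per variable."""
    return [sum(loading ** 2 for loading in row) for row in loadings]


# Hypothetical loadings: rows are variables, columns are Factor 1 and Factor 2.
loadings = [[0.8, 0.1],
            [0.7, 0.2],
            [0.2, 0.9],
            [0.1, 0.6]]

# Share of the first variable's variance explained by Factor 1 alone.
explained_by_factor_1 = loadings[0][0] ** 2

communalities = final_communalities(loadings)
print(round(explained_by_factor_1, 2))
print([round(c, 2) for c in communalities])
```

In this example the two factors capture 85% of the variance of the third variable but only 37% of the fourth.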

Eigen Values
If the loadings on a factor are squared and summed across the variables, the result is the 'Eigen Value' of that factor. The sum of the initial communalities is the 'Sum of Eigen Values', which equals the number of variables in the analysis when Principal Component Analysis is used. The ratio of a factor's Eigen Value to the sum of Eigen Values gives the percentage of variance explained by that factor.
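
The same kind of loading matrix, summed column-wise instead of row-wise, gives the Eigen Values. The sketch below assumes PCA, so the sum of Eigen Values is taken to be the number of variables; the loadings are hypothetical and the function names are made up for this example.

```python
def eigenvalues_from_loadings(loadings):
    """Sum of squared loadings down each factor column."""
    n_factors = len(loadings[0])
    return [sum(row[f] ** 2 for row in loadings) for f in range(n_factors)]


def percent_of_variance(eigenvalues, n_variables):
    """Percentage of total variance per factor, assuming PCA (total = n_variables)."""
    return [100.0 * value / n_variables for value in eigenvalues]


# Hypothetical loadings: rows are variables, columns are Factor 1 and Factor 2.
loadings = [[0.8, 0.1],
            [0.7, 0.2],
            [0.2, 0.9],
            [0.1, 0.6]]

eigenvalues = eigenvalues_from_loadings(loadings)
percentages = percent_of_variance(eigenvalues, n_variables=len(loadings))
print([round(v, 2) for v in eigenvalues])
print([round(p, 1) for p in percentages])
```

Only the two retained factors appear here, so the percentages add up to less than 100; the remainder belongs to the factors that were not retained.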

Obtaining the 'Un-rotated' Solution


The factor matrix contains the loadings of each variable on the factors. The first factor extracts the maximum variance across all the variables (i.e., it can be viewed as a summary of the best linear relationship that exists in the data). As a result, most variables tend to load on the first factor, which complicates interpretation for the researcher.

Interpreting Factor Loadings


Factor Loading                     Remarks
-0.30 to +0.30                     Minimal
+0.40 to +0.50, -0.40 to -0.50     More Important Loadings
+0.50 to +1.00, -0.50 to -1.00     Very Significant Loadings

However, sample size is critical in determining which loadings are significant


Loading    Sample Size Required
0.30       350
0.40       200
0.45       150
0.50       120
0.55       100
0.60       85
0.65       70
0.70       60
0.75       50

(Computation made with SOLO Power Analysis, BMDP Statistical Software, Inc., 1993)
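
The table above can be used as a lookup: given the sample size, find the smallest loading that counts as significant. The sketch below encodes exactly the values from the table; the constant and function names are made up for this example.

```python
# (loading, sample size required), copied from the table above.
SIGNIFICANT_LOADING_TABLE = [
    (0.30, 350), (0.40, 200), (0.45, 150), (0.50, 120), (0.55, 100),
    (0.60, 85), (0.65, 70), (0.70, 60), (0.75, 50),
]


def minimum_significant_loading(sample_size):
    """Smallest tabulated loading that is significant for this sample size.

    Returns None when the sample is smaller than the smallest tabulated size.
    """
    for loading, required in SIGNIFICANT_LOADING_TABLE:
        if sample_size >= required:
            return loading
    return None


print(minimum_significant_loading(100))
print(minimum_significant_loading(400))
print(minimum_significant_loading(40))
```

With 100 respondents, for instance, only loadings of 0.55 or above would be treated as significant.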

Interpreting factor loadings


o For each variable, identify the highest loading that is also significant for the given sample size, and assign the variable to that factor.
o If every variable has only one high, significant loading, on one particular factor, the interpretation is very simple: each factor is profiled by the characteristics of the variables that load significantly on it.
o If all or most of the variables have high, significant loadings on the same single factor, the interpretation becomes very difficult.

Communalities
The communality of a variable is the amount (percentage or fraction) of its variance that is explained by the retained factors. It is the sum of the squared loadings of the variable across the factors retained in the study. A low communality means that the variable is not well captured by the factors.

Rotation of factor matrix


Rotation is a process by which the reference axes (the Factor 1 axis, the Factor 2 axis, etc.) are turned about the origin until some other 'better position' is reached. The objective is to redistribute the variance from the earlier factors to the later ones, so that each variable has a high loading on only one factor and low, non-significant loadings on the remaining factors.

Types of Rotations
Rotations can be classified into two types:
(i) Orthogonal Rotation
(ii) Oblique Rotation

As the names suggest, in an orthogonal rotation the angle between the reference axes is maintained at 90 degrees, which is not the case in an oblique rotation.
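
A small numerical sketch of an orthogonal rotation for two factors follows. The unrotated loadings are hypothetical, and the 45-degree angle is picked by hand purely for illustration; methods such as varimax choose the angle by optimizing a criterion instead. The function names are made up for this example.

```python
import math


def rotate_loadings(loadings, angle_degrees):
    """Re-express two-factor loadings on axes turned counterclockwise by the angle."""
    theta = math.radians(angle_degrees)
    c, s = math.cos(theta), math.sin(theta)
    return [[a * c + b * s, -a * s + b * c] for a, b in loadings]


def communalities(loadings):
    """Sum of squared loadings per variable."""
    return [sum(value ** 2 for value in row) for row in loadings]


# Hypothetical unrotated loadings: every variable loads on Factor 1.
unrotated = [[0.7, 0.5],
             [0.6, 0.6],
             [0.6, -0.5],
             [0.7, -0.6]]

rotated = rotate_loadings(unrotated, angle_degrees=45.0)
for before, after in zip(unrotated, rotated):
    print(before, [round(value, 3) for value in after])
```

After the rotation the first two variables load mainly on Factor 1 and the last two mainly on Factor 2, while each variable's communality is unchanged because the axes remain at 90 degrees.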

Orthogonal Factor Rotation


[Figure: variables V1 to V5 plotted against the unrotated Factor I and Factor II axes (each running from -1.0 to +1.0), with the orthogonally rotated Factor I and Factor II axes superimposed.]

Oblique Factor Rotation


[Figure: variables V1 to V5 plotted against the unrotated Factor I and Factor II axes (each running from -1.0 to +1.0), with both the orthogonally rotated and the obliquely rotated Factor I and Factor II axes superimposed.]

Choosing Factor Rotation Methods

Orthogonal rotation methods:


o are the most widely used rotational methods.
o are the preferred methods when the research goal is data reduction to either a smaller number of variables or a set of uncorrelated measures for subsequent use in other multivariate techniques.

Oblique rotation methods:


o are best suited to the goal of obtaining several theoretically meaningful factors or constructs because, realistically, very few constructs in the real world are uncorrelated.