
Principal Component Analysis (PCA)
Noor Mohamed 15BM60072
Aparjeet Kausal 15BM60073
Introduction

PCA is used to find a few easily interpreted variables that
summarize most of the variation in the original data.
The chapter describes how to use the Excel Solver to perform
principal component analysis.
Before that, we need to know how to compute the variance of a
linear combination of variables and the covariance between two
linear combinations of variables.
Principal Component Analysis
Data sets with a large number of variables can quickly become
overwhelming.
For example: daily returns for the last 20 years on all 30 stocks in
the Dow Jones Index.
Such a data set contains so many variables that it is difficult to
understand.
We can construct a few easily interpreted factors that help us
understand the nature of the variability inherent in the original
data set.
For instance, daily stock returns might be summarized by a component
reflecting the overall stock index, a component reflecting
movement in the financial sector, and a component reflecting
movement in the manufacturing sector.
Linear Combinations, Variances and Covariances
Auto attribute data
20 people were asked to rate, on a 1-5 scale (5 = highly important,
1 = not important), how important each automobile attribute was in
their purchase decision.
Sample variances and standard deviations

Standardized importance ratings

Computing 2FE - HP and PR + 2HP
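The variance of a linear combination, and the covariance between two of them, can be checked numerically. The sketch below is only an illustration: the ratings are randomly generated (not the original survey data), and the weights (2, -1, 0) and (0, 2, 1) over the columns (FE, HP, PR) are an assumed reading of the two combinations named in the slide title.

```python
import numpy as np

# Hypothetical 1-5 importance ratings from 20 respondents (NOT the original data).
rng = np.random.default_rng(0)
ratings = rng.integers(1, 6, size=(20, 3)).astype(float)  # columns: FE, HP, PR

# Sample covariance matrix of the three ratings (divides by n - 1).
S = np.cov(ratings, rowvar=False)

a = np.array([2.0, -1.0, 0.0])  # assumed weights for 2*FE - HP
b = np.array([0.0, 2.0, 1.0])   # assumed weights for PR + 2*HP

# Variance of a'x and covariance between a'x and b'x via the covariance matrix.
var_a = a @ S @ a
cov_ab = a @ S @ b

# Cross-check: form the combinations row by row and measure them directly.
combo_a = ratings @ a
combo_b = ratings @ b
direct_var_a = combo_a.var(ddof=1)
direct_cov_ab = np.cov(combo_a, combo_b)[0, 1]

print(var_a, direct_var_a)
print(cov_ab, direct_cov_ab)
```

Both routes should agree up to floating-point error, which is why the covariance matrix alone is enough to evaluate any candidate set of weights.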


Matrix Form
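As a compact reminder of the standard result, the variance of a linear combination and the covariance between two linear combinations can be written with the sample covariance matrix. The symbols here are assumptions for illustration, not the slide's own notation: x is the vector of variables, S its sample covariance matrix, and a, b are weight vectors.

```latex
\[
\operatorname{Var}(a^{\top}x) = a^{\top} S\, a,
\qquad
\operatorname{Cov}(a^{\top}x,\; b^{\top}x) = a^{\top} S\, b .
\]
```

When the variables are standardized, S is the sample correlation matrix.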
Properties
The length of each PC is normalized to 1, i.e. the squared weights
of each PC sum to 1.
Each pair of PCs has zero sample covariance (the PCs are orthogonal).
Orthogonality ensures that the PCs represent different aspects of
the variability in the data.
The sum of the variances of the PCs equals n, the number of
variables. Because each standardized variable has variance 1,
this means the PCs decompose the total variance of the n
standardized variables. If the PCs are created from a sample
covariance matrix instead, the sum of the variances of the PCs
equals the sum of the sample variances of the n variables.
Given the previous restrictions, the first PC is chosen to have the
maximum possible variance. After determining the first PC, the
second PC is chosen as the maximum-variance linear combination of
unit length that is orthogonal to the first PC, and so on.
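The properties listed above can be verified numerically. The sketch below does not use the Excel Solver approach described in the chapter; it swaps in an eigendecomposition of the correlation matrix, which yields the same maximum-variance, unit-length, mutually orthogonal combinations (up to sign). The data are randomly generated for illustration only.

```python
import numpy as np

# Hypothetical data: 20 observations of 4 correlated variables (illustrative only).
rng = np.random.default_rng(1)
data = rng.normal(size=(20, 1)) + 0.8 * rng.normal(size=(20, 4))

# Standardize each variable to mean 0 and sample variance 1.
z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
R = np.cov(z, rowvar=False)  # covariance of standardized data = correlation matrix

# Eigendecomposition of the symmetric matrix R; sort by decreasing variance.
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
pc_variances = eigvals[order]    # variance of each PC, largest first
pc_weights = eigvecs[:, order]   # column k holds the weights of PC k+1

# PC scores and their sample covariance matrix.
scores = z @ pc_weights
score_cov = np.cov(scores, rowvar=False)

print("weight vector lengths:", np.linalg.norm(pc_weights, axis=0))
print("PC variances:", pc_variances)
print("sum of PC variances:", pc_variances.sum())
```

The weight vectors have length 1, the off-diagonal entries of `score_cov` are zero up to rounding, and the PC variances add up to the number of variables (4 here).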
Finding the First Principal Component
Finding the Second Principal Component
Finding All Principal Components
Communalities
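A variable's communality is usually defined as the share of its variance explained by the retained principal components, computed as the sum of its squared loadings on those components. The sketch below assumes standardized variables, uses randomly generated data, and picks k = 2 retained components purely for illustration; it relies on an eigendecomposition rather than the Excel Solver.

```python
import numpy as np

# Hypothetical data: 20 observations of 4 correlated variables (illustrative only).
rng = np.random.default_rng(2)
data = rng.normal(size=(20, 1)) + 0.8 * rng.normal(size=(20, 4))
R = np.corrcoef(data, rowvar=False)  # sample correlation matrix

# Principal components of the correlation matrix, largest variance first.
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Loading of variable i on PC j = correlation between variable i and PC j.
loadings = eigvecs * np.sqrt(eigvals)

k = 2  # assumed number of retained PCs
communalities = (loadings[:, :k] ** 2).sum(axis=1)

# With all PCs retained, every standardized variable is fully explained.
full = (loadings ** 2).sum(axis=1)

print("communalities with k = 2:", communalities)
print("communalities with all PCs:", full)
```

A communality close to 1 means the retained PCs capture nearly all of that variable's variation; a low value suggests the variable is poorly represented by them.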
