
BINARY LOGISTIC REGRESSION

CASE 3 DATA
When all predictors are metric or interval, you can use two types of analysis:

1) Discriminant Analysis: has more discriminating power. Used when all the
assumptions of regression are satisfied.
2) Regression (binary logistic): can be used in more general situations (the data need not be normal).

Either analysis can produce Type I and Type II errors.

Parametric tests have more discriminating power than non-parametric tests.

y = a + b1·x1 + b2·x2 + … + b9·x9

Here y is categorical. y can also have more than two categories (e.g. yes / no / can't say), in which case multinomial logistic regression is used.
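
A minimal sketch of fitting this kind of model outside SPSS, using Python's statsmodels. The file name and column names ("preferred", "count", "price", …, "taping") are placeholders standing in for the Case 3 variables, not the actual data set:

    import pandas as pd
    import statsmodels.api as sm

    df = pd.read_csv("case3.csv")                       # hypothetical data file
    predictors = ["count", "price", "value", "style", "taping"]  # illustrative subset

    X = sm.add_constant(df[predictors])                 # adds the intercept term 'a'
    y = df["preferred"]                                 # 0 = not preferred, 1 = preferred

    model = sm.Logit(y, X).fit()                        # maximum-likelihood estimation
    print(model.summary())                              # B, SE, Wald tests, p-values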

You will again test the following:


1. Assumptions
2. Validity
3. Strength
4. Relevance
5. Hierarchy
6. Model for prediction

PROCESS:

Regression > Binary Logistic (two categories: 0, 1) / Multinomial Logistic (0, 1, 2, …)

Selection Variable: used when you want separate results for different categories. For
example, the factors for males and females might differ, or those for mothers and fathers
might differ.
Categorical: used when some or all of the predictors are categorical (0/1 type,
non-metric in nature).
Save: Probabilities & group membership; Residuals.
Options: Classification plots; Hosmer-Lemeshow goodness of fit [1]. (Ask about the other options.)
Next > Previous?
Dependent Variable Encoding

Original Value     Internal Value
Not Preferred      0
Preferred          1

What is the significance of this table?


[1] Goodness of fit tests: measures of goodness of fit typically summarize the discrepancy between
observed values and the values expected under the model in question. Such measures can be used
in statistical hypothesis testing, e.g. to test for normality of residuals, to test whether two samples are drawn
from identical distributions (see Kolmogorov-Smirnov test), or whether outcome frequencies follow a
specified distribution (see Pearson's chi-square test).
Not Preferred = 0 and Preferred = 1 means that as x increases, the probability of y = Preferred
increases. With reverse coding it would mean that as x increases, the probability of y = Not
Preferred increases.

Validity of the Test


BLOCK 1
Hosmer & Lemeshow Goodness of Fit

H0: The binary logistic regression model fits the data.

Sig. (p value) = .818 > .05


Accept H0. The proposed regression model fits the data.

Conclusion: the model is valid.
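
SPSS reports the Hosmer-Lemeshow statistic directly; below is a hedged sketch of the usual "deciles of risk" computation, assuming `y` are the observed 0/1 values and `p_hat` the predicted probabilities from the earlier fit:

    import numpy as np
    import pandas as pd
    from scipy.stats import chi2

    def hosmer_lemeshow(y, p_hat, n_groups=10):
        """Hosmer-Lemeshow chi-square using deciles of predicted probability."""
        d = pd.DataFrame({"y": np.asarray(y), "p": np.asarray(p_hat)})
        d["group"] = pd.qcut(d["p"], n_groups, labels=False, duplicates="drop")
        stat = 0.0
        for _, g in d.groupby("group"):
            obs1, exp1 = g["y"].sum(), g["p"].sum()       # observed / expected 1s
            obs0, exp0 = len(g) - obs1, len(g) - exp1     # observed / expected 0s
            stat += (obs1 - exp1) ** 2 / exp1 + (obs0 - exp0) ** 2 / exp0
        dof = d["group"].nunique() - 2
        return stat, chi2.sf(stat, dof)                   # statistic, p-value

    # p > .05 (here .818) -> accept H0: the model fits the data.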

Strength of the Model

1) Omnibus Test of Model Coefficients

H0: The model is insignificant.

Sig. (p value) = .000 (Model) < .05, so reject H0.

Proxy R². Since R squared or adjusted R squared cannot be calculated directly for
logistic regression (the regression assumption of metric data is not met), proxy R squared
values are used instead.

2) Cox and Snell R squared and Nagelkerke R squared

These should be either both above or both below 0.5.

Here the value is 0.541. (If the proxy R squared were low, one option is to try stepwise regression.)
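
A short sketch of how these proxy R squared values are computed from the model and null log-likelihoods; `model` is the hypothetical statsmodels fit from the earlier sketch:

    import numpy as np

    def pseudo_r2(ll_model, ll_null, n):
        """Cox & Snell and Nagelkerke proxy R-squared from log-likelihoods."""
        cox_snell = 1 - np.exp(2 * (ll_null - ll_model) / n)
        nagelkerke = cox_snell / (1 - np.exp(2 * ll_null / n))
        return cox_snell, nagelkerke

    # cs, nk = pseudo_r2(model.llf, model.llnull, model.nobs)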

3) Classification table
This gives the percentage of correctly predicted values, e.g. how many observed
"preferred" cases are also predicted as "preferred". The diagonal cells give the correctly
predicted values (hit ratio [2]).
Hit Ratio = 87% (in this case)
Thumb rule: at least 70%
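
A minimal sketch of building the classification table and hit ratio with the 0.5 cutoff, reusing the hypothetical `y` and `model` from the earlier logistic sketch:

    import numpy as np
    import pandas as pd

    def classification_table(y, p_hat, cutoff=0.5):
        """Cross-tabulate observed vs predicted class and compute the hit ratio."""
        pred = (np.asarray(p_hat) >= cutoff).astype(int)    # 0.5 cutoff as in SPSS
        table = pd.crosstab(pd.Series(y, name="Observed"),
                            pd.Series(pred, name="Predicted"))
        hit_ratio = (pred == np.asarray(y)).mean()          # diagonal cells / total N
        return table, hit_ratio

    # table, hr = classification_table(y, model.predict(X))
    # print(table); print(f"Hit ratio = {hr:.0%}")           # compare with the 70% thumb rule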

MODEL

y = -14.914 + 1.023(count) + 0.053(price) + … - 0.159(taping)

The predicted (expected) probability is saved as PRE_1.

y = 0 -------- 0.5 -------- 1
0 - not preferred
1 - preferred

The predicted probability lies between 0 and 1; cases above the 0.5 cutoff are classified as
preferred (1), cases below as not preferred (0).

[2] Hit Ratio = sum of the diagonal cells / total number of observations.
Respondent 1 Observed Preference=1


Predicted Preference=0

Thus it will lie in the category who preferred the diaper but actually the model has
predicted not preferred.
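
The equation above is the linear predictor; PRE_1 is obtained by passing it through the logistic transformation. A small sketch of that step, using only the coefficients quoted above (the elided ones are not filled in):

    import numpy as np

    def predict_preference(b0, coefs, x, cutoff=0.5):
        """Turn the linear predictor into a probability and a 0/1 classification."""
        z = b0 + np.dot(coefs, x)          # y = a + b1*x1 + ... (linear predictor)
        p = 1.0 / (1.0 + np.exp(-z))       # logistic transformation -> PRE_1 in (0, 1)
        return p, int(p >= cutoff)         # 1 = preferred, 0 = not preferred

    # Illustrative call with the quoted coefficients only (the others are omitted here):
    # p, group = predict_preference(-14.914, [1.023, 0.053, -0.159],
    #                               [x_count, x_price, x_taping])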

PREDICTORS ANALYSIS

Wald Statistic = (B / SE)^2

where SE is the standard error of the coefficient B.

Exp(B) gives the odds ratio: the factor by which the odds of y = 1 change for a one-unit
increase in the predictor.

H0: The regression coefficient is insignificant (B = 0).

z = (B - 0) / SE(B)

Wald Statistic = z^2

Use the Wald statistic for determining the hierarchy of the predictors.
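
A hedged sketch of reproducing the Wald column and Exp(B) from the hypothetical statsmodels fit used earlier:

    import numpy as np
    import pandas as pd
    from scipy.stats import chi2

    def wald_table(model):
        """Wald statistic, p-value and odds ratio for each coefficient."""
        wald = (model.params / model.bse) ** 2          # (B / SE)^2
        return pd.DataFrame({
            "B": model.params,
            "SE": model.bse,
            "Wald": wald,
            "Sig": chi2.sf(wald, df=1),                 # chi-square with 1 df
            "Exp(B)": np.exp(model.params),             # odds ratio
        }).sort_values("Wald", ascending=False)         # hierarchy of predictors

    # print(wald_table(model))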

ASSUMPTION
Correlation matrix for checking multicollinearity among the predictors.
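
A one-line check, assuming the `df` and `predictors` names from the earlier sketch; high off-diagonal correlations (say above 0.8) warn of multicollinearity:

    corr = df[predictors].corr()
    print(corr.round(2))
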
FACTOR ANALYSIS
CASE 3(C) DATA

F1 = k1·x1 + k2·x2 + … + k9·x9
F2 = l1·x1 + l2·x2 + … + l9·x9
F3 = m1·x1 + m2·x2 + … + m9·x9
F4 = n1·x1 + n2·x2 + … + n9·x9

Suppose F2 is related to x1, x3 and x4.

You will see the following kind of data (standardized factor loadings) if this is so:

x1:  k1 = .09   l1 = .91   m1 = .11   n1 = .32
x2:  k2 = .71   l2 = .11   m2 = .32   n2 = .13
x3:  k3 = .06   l3 = .73   m3 = .19   n3 = .42

By deductive reasoning we can conclude that x1 and x3 are mutually related (both load heavily
on F2), while x2 belongs with F1: variables that load heavily on the same factor are dependent
on a common underlying dimension.

It will never happen that you get the same variable loading heavily on different factors.

Now instead of using the traditional explained variance we will use shared variance.

PROCESS

Data Reduction > Factor
Enter all variables you want to run the factor analysis on.

Selection Variable: used when you want separate results for different categories. For
example, the factors for males and females might differ, or those for mothers and fathers
might differ.
Descriptives: select Coefficients, KMO and Bartlett's test, Significance levels, Univariate
descriptives.
Extraction: Method is Principal Components [3] (ask about the other methods); select Scree Plot;
extract eigenvalues over 1.
Rotation: select Varimax (IMPORTANT).
Scores: Save as variables.
(Ask about the methods: Regression, Bartlett, Anderson-Rubin.)
Display factor score coefficient matrix.
Options: no changes.
If you run the factor analysis you get three new columns: FAC1_1, FAC2_1, FAC3_1.
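
A minimal sketch of the same extraction outside SPSS, assuming the third-party factor_analyzer package is installed; the file name and the `items` column names are hypothetical stand-ins for the nine rating variables:

    import pandas as pd
    from factor_analyzer import FactorAnalyzer

    df = pd.read_csv("case3c.csv")                         # hypothetical data file
    items = ["count", "price", "value", "unisex", "style",
             "absorbency", "leakage", "comfort", "taping"]

    fa = FactorAnalyzer(n_factors=3, method="principal", rotation="varimax")
    fa.fit(df[items])

    loadings = pd.DataFrame(fa.loadings_, index=items)     # rotated component matrix
    communalities = pd.Series(fa.get_communalities(), index=items)
    scores = fa.transform(df[items])                       # like FAC1_1, FAC2_1, FAC3_1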

OUTPUT ANALYSIS

Correlation Matrix

If all off-diagonal values are below 0.2 or 0.3, the variables are not sufficiently related, so
factor analysis is not required; the data are not suitable for factor analysis.

Note: this is only indicative/suggestive of whether the data are suitable for factor analysis. You
only get an impression, which you have to confirm. Confirmation is through the KMO and
Bartlett's tests.

KMO and Bartlett's Test

1) Bartlett's Test
To confirm whether the nine variables as a whole are suitable for factor analysis, i.e.
whether there is a perceptual dimension beneath the data, we use Bartlett's test.

H0: The correlation matrix is an identity matrix.

[An identity matrix has 1s on the diagonal and 0s everywhere else, i.e. the variables are
uncorrelated.]

Only if Bartlett's test is rejected (at 0.05 significance) is there a significant relationship between
the variables (data suitable for factor analysis). Here the significance is .000, so we reject H0.

2) KMO (Kaiser-Meyer-Olkin) Measure: it gives the proportion of variation shared with the
other variables (between 0 and 1).

A value close to 1 indicates more sharing of variation.

A KMO statistic greater than 0.6 (or 0.5) means the data are suitable for factor analysis.
Here it is .803.

If the above two tests give conflicting results, use the first one (Bartlett's test).
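
A hedged sketch of both checks using the same factor_analyzer package and the hypothetical `df[items]` from the sketch above:

    from factor_analyzer.factor_analyzer import (
        calculate_bartlett_sphericity, calculate_kmo)

    chi_square, p_value = calculate_bartlett_sphericity(df[items])
    kmo_per_item, kmo_total = calculate_kmo(df[items])

    print(f"Bartlett: chi2 = {chi_square:.1f}, p = {p_value:.3f}")  # reject H0 if p < .05
    print(f"KMO = {kmo_total:.3f}")                                 # want > 0.6 (or 0.5)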

NOTE: The above tests tell whether all nine variables as a whole are suitable or not. To see
which variables are related, look at the COMMUNALITIES.

[3] Principal component analysis (PCA) involves a mathematical procedure that transforms a
number of possibly correlated variables into a smaller number of uncorrelated variables called
principal components. PCA was invented in 1901 by Karl Pearson.
COMMUNALITIES

This table gives the percentage of variation shared by each individual variable with the other
variables. In the table, the Initial value is always 1 because both shared and unshared variance
are included. The Extraction value is what we look at for shared variance. If one or more
communalities come out low (thumb rule: < 0.5), exclude that variable and rerun the
factor analysis.

Table: Total Variance Explained

Here "components" refers to the factors. The total variance captured (extracted) by each
corresponding factor is given. The table lists all 9 components but the shared variation of only
three factors. Why does the first column list all nine components if only three were ultimately
extracted?

Answer: In this analysis we selected Extraction > Eigenvalues greater than 1.
Now suppose the eigenvalue of the fourth component is 0.999; this component should arguably
also be a factor, as it is nearly as important, but it is not extracted.
In that case we rerun the factor analysis selecting Extraction > Number of factors = 4.

Percent Variation Extracted

The concept is that the minimum number of factors should extract the maximum variance.

Factor analysis will be effective under two conditions:

a) The number of factors should be as low as possible.
b) The percent variance extracted should be as high as possible.

Here 9 variables have been reduced to three factors, capturing 84% of the variation
and sacrificing 16% of the variation.

There will always be a trade-off. The variance extracted should not be too low.

Thumb rule: the variance extracted should be at least 65%.
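
A small sketch of how the eigenvalues and percent variance extracted can be computed from the correlation matrix, reusing the hypothetical `df[items]` from the factor-analysis sketch:

    import numpy as np

    corr = np.corrcoef(df[items].to_numpy(), rowvar=False)
    eigenvalues = np.sort(np.linalg.eigvalsh(corr))[::-1]    # largest first

    proportion = eigenvalues / eigenvalues.sum()              # % variance per component
    cumulative = proportion.cumsum()                          # running total

    n_retained = int((eigenvalues > 1).sum())                 # "eigenvalues over 1" rule
    print(f"Retain {n_retained} factors, extracting "
          f"{cumulative[n_retained - 1]:.0%} of the variance")  # vs. the 65% thumb rule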

SCREE PLOT

The steep slope shows where more variance is captured. The scree plot suggests how many
factors there should be, and for this purpose the elbow criterion is used (the total number of
factors taken should coincide with the elbow).
[Scree plot: Eigenvalue vs. Component Number (components 1-9)]

COMPONENT MATRIX & ROTATED COMPONENT MATRIX

These tables show the correlation between each factor and each variable. The principal
component method puts the maximum shared variation in the first factor. Thus, most variables
get assigned to the first factor.

In the (unrotated) component matrix, since principal components is used, the first factor will
always capture the maximum variation, so the maximum number of variables will be included in
it and the interpretation of the factors becomes difficult. Thus the component matrix should not
be used for analysis.

Initially the axes are orthogonal: F1 and F2 are perpendicular. But the axes are rotated in such
a way that the maximum variation is captured. V1 and V3 both belonged to F1 along with V6
and V9, but as the axes are rotated, V6 and V9 move to F2.
Also note that in the variance explained table the total variance shared remains the same for the
component matrix and the rotated component matrix, but the variance extracted by each factor
changes with rotation.

Rotated Component Matrix(a)

                            Component
                           1      2      3
Count Per Box            .224   .865   .251
Price Charged            .193   .891   .243
Value                    .183   .862   .105
Unisex vs. Separate Sex  .244   .266   .902
Style                    .237   .220   .916
Absorbency               .850   .232   .256
Leakage                  .879   .182   .257
Comfort                  .863   .177   .157
Taping                   .768   .145   .079

Extraction Method: Principal Component Analysis.
Rotation Method: Varimax with Kaiser Normalization.
(a) Rotation converged in 4 iterations.

From this table we can see that:

F1 = Absorbency, Leakage, Comfort, Taping
F2 = Count, Price, Value
F3 = Unisex vs. Separate Sex, Style

We can assign names to these factors, since there is an underlying dimension beneath each
factor:
F1 = Utility/Functionality
F2 = Value for money
F3 = Design/Style quotient
DISCRIMINANT ANALYSIS

On the basis of certain predictors we try to differentiate between categories.

It is a technique for analyzing marketing research data when the criterion or dependent
variable is categorical in nature and the predictor or independent variables are interval in
nature. If the dependent variable is not categorical, convert it into a categorical variable
(not necessarily with the same number of categories). The steps are:
1. Development of discriminant functions, or linear combinations of independent
variables, which will best discriminate between the categories of the criterion or
dependent variable.
2. Examination of whether significant differences exist among the groups in terms of
the predictor variables.
3. Determination of which predictor variables contribute most to the intergroup
differences.
4. Classification of the cases into one of the groups based on the values of the
predictor variables.
5. Evaluation of the accuracy of classification.

Two-group Discriminant Analysis: the criterion variable has two groups.

Multiple Discriminant Analysis: more than two groups.

BEHAVIORAL ANALYSIS

D = b0 + b1·x1 + b2·x2 + … + b9·x9

D = discriminant score
b's = discriminant loadings / coefficients or weights

(Recall: binary logistic regression is used when the normality assumption is violated;
discriminant analysis assumes it holds.)
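
A minimal two-group discriminant analysis sketch with scikit-learn; the file and column names are the same hypothetical ones used earlier, with `preferred` as the 0/1 criterion:

    import pandas as pd
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    df = pd.read_csv("case3.csv")                              # hypothetical data file
    items = ["count", "price", "value", "unisex", "style",
             "absorbency", "leakage", "comfort", "taping"]

    lda = LinearDiscriminantAnalysis()
    lda.fit(df[items], df["preferred"])

    print(lda.coef_, lda.intercept_)                           # discriminant weights b1..b9, b0
    print("Hit ratio:", lda.score(df[items], df["preferred"])) # proportion correctly classified
    membership = lda.predict(df[items])                        # predicted group membership
    posterior = lda.predict_proba(df[items])                   # probabilities of group membership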

Classify > Discriminant

Statistics
 Means
 Univariate ANOVAs
 Box's M
 Fisher's
 Unstandardized
 Within-groups correlation
Classify
 Casewise results
 Summary table
Save…
 Predicted group membership
 Discriminant scores
 Probabilities of group membership
Let's see what discriminant analysis does.

Count
Preferred: 4.12
Not Preferred: 4.08

Style
Preferred: 4.10
Not Preferred: 1.36

Style is playing an important role in discriminating, but Count is not.

The Group Statistics table suggests which variables are playing a role in discriminating. If there
is a large difference between the means of the two groups, then that variable is playing a role.

This table is not conclusive; it is only suggestive.

H0: Group means are equal

Mean(preferred) = Mean(not preferred)

p < .05: reject H0.
The predictor is playing a role in discriminating.
(Here all the predictors together are playing a role.)

WILKS' LAMBDA

Total variance = between-group variance + within-group variance

Wilks' Lambda = within-group sum of squares / total sum of squares

It lies between 0 and 1.

The lower the value of lambda, the more important the variable is for discrimination.
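
A small sketch of computing Wilks' lambda for a single predictor, reusing the hypothetical `df` and `items` from the discriminant sketch:

    import numpy as np
    import pandas as pd

    def wilks_lambda(x, groups):
        """Wilks' lambda for one predictor = within-group SS / total SS."""
        x, groups = np.asarray(x, float), np.asarray(groups)
        ss_total = ((x - x.mean()) ** 2).sum()
        ss_within = sum(((x[groups == g] - x[groups == g].mean()) ** 2).sum()
                        for g in np.unique(groups))
        return ss_within / ss_total

    # lambdas = pd.Series({v: wilks_lambda(df[v], df["preferred"]) for v in items})
    # print(lambdas.sort_values())   # lowest lambda = most important for discrimination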

Tests of Equality of Group Means

                         Wilks' Lambda       F     df1   df2   Sig.
Count Per Box                .651        159.897    1    298   .000
Price Charged                .702        126.733    1    298   .000
Value                        .780         83.930    1    298   .000
Unisex vs. Separate Sex      .610        190.575    1    298   .000
Style                        .699        128.296    1    298   .000
Absorbency                   .670        146.588    1    298   .000
Leakage                      .690        133.692    1    298   .000
Comfort                      .784         82.343    1    298   .000
Taping                       .880         40.477    1    298   .000
Arrange the predictors by Wilks' lambda (lowest, i.e. most discriminating, first):
Unisex vs. Separate Sex
Count Per Box
Absorbency

There are two assumptions that need to be satisfied for discriminant analysis:
1. The predictors should follow a normal distribution.
2. The groups should have equal variance-covariance matrices.

H0: The variance-covariance matrices are equal.

You want to accept the null hypothesis:

p > .05
Accept H0

Box's Test of Equality of Variance-Covariance Matrices

Test Results

Box's M        48.212
F  Approx.      1.036
   df1             45
   df2     242753.944
   Sig.          .405

Tests the null hypothesis of equal population covariance matrices.

0.405 > 0.05
Accept H0
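
Box's M is not in the common Python statistics packages, so below is a hedged manual sketch of the statistic and its usual chi-square approximation, again reusing the hypothetical `df` and `items`:

    import numpy as np
    from scipy.stats import chi2

    def box_m(X, groups):
        """Box's M test of equal group covariance matrices (chi-square approximation)."""
        X, groups = np.asarray(X, float), np.asarray(groups)
        labels = np.unique(groups)
        n_i = np.array([np.sum(groups == g) for g in labels])
        covs = [np.cov(X[groups == g], rowvar=False) for g in labels]
        N, p, k = len(X), X.shape[1], len(labels)
        pooled = sum((n - 1) * S for n, S in zip(n_i, covs)) / (N - k)

        M = (N - k) * np.linalg.slogdet(pooled)[1] \
            - sum((n - 1) * np.linalg.slogdet(S)[1] for n, S in zip(n_i, covs))
        c = (np.sum(1.0 / (n_i - 1)) - 1.0 / (N - k)) \
            * (2 * p**2 + 3 * p - 1) / (6.0 * (p + 1) * (k - 1))
        dof = p * (p + 1) * (k - 1) / 2
        stat = M * (1 - c)
        return M, stat, chi2.sf(stat, dof)

    # M, stat, p_value = box_m(df[items].to_numpy(), df["preferred"].to_numpy())
    # p_value > .05 -> accept H0: the variance-covariance matrices are equal.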
