In promotion management, typical questions include:
1. What are the key success factors in choosing media vehicles?
2. What makes the difference between successful and inefficient sales reps?
3. What attributes are important in consumers' adoption of coupons?
4. What makes the difference between efficient and inefficient messages?
5. What attributes are important to create a successful display?
6. etc.
Discriminant Analysis
A technique for analyzing marketing research data when the dependent variable is categorical and the independent variables are metric in nature. It develops a set of criteria that can be used to separate objects into groups such that each object is more like the other objects in its group than like objects outside the group. Objects can be either variables or observations.
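As a concrete illustration of the technique just defined, the following is a minimal sketch using scikit-learn's LinearDiscriminantAnalysis (an assumption; the slides do not name any software). The data values are illustrative only:

```python
# Minimal sketch of discriminant analysis, assuming scikit-learn is available.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Metric independent variables (rows = objects, columns = measurements).
X = np.array([[2.0, 2.0], [1.0, 2.0], [3.0, 2.0],
              [4.0, 5.0], [5.0, 6.0], [5.0, 7.0]])
# Categorical dependent variable: the group each object belongs to.
y = np.array([0, 0, 0, 1, 1, 1])

lda = LinearDiscriminantAnalysis()
lda.fit(X, y)
print(lda.coef_)                  # discriminant weights for the x's
print(lda.predict([[2.0, 2.0]]))  # classify a new object into a group
```

The fitted weights define the linear combination of the independent variables that best separates the two groups.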
Objectives
a) development of linear combinations of the independent variables that best discriminate between the categories of the dependent variable (groups);
b) examination of whether significant differences exist among the groups in terms of the independent variables;
c) determination of which independent variables contribute most to the inter-group differences;
d) classification of cases to one of the groups based on the values of the independent variables;
e) evaluation of the accuracy of classification, etc.
[Figure: two groups, A and B, plotted on axes X1 and X2; the discriminant function projects the objects onto a single axis Z of discriminant scores that best separates A from B.]
Two-Group Discriminant Analysis: when the dependent variable has two categories, only one discriminant function is derived.
Multiple Discriminant Analysis: when the dependent variable has three or more categories, more than one discriminant function is derived.
Examples: Gender (Male vs. Female); Heavy Users vs. Light Users; Purchasers vs. Non-purchasers; Good Credit Risk vs. Poor Credit Risk; Member vs. Non-Member; Attorney vs. Physician vs. Professor.
Example data:

Group 1 - Definitely switch
  Respondent:             1   2   3   4   5
  Price competitiveness:  2   1   3   2   2
  Service level:          2   2   2   1   3

Group 2 - Undecided
  Respondent:             6   7   8   9  10
  Price competitiveness:  4   4   5   5   5
  Service level:          2   3   1   2   3

Group 3 - Definitely not switch
  Respondent:            11  12  13  14  15
  Price competitiveness:  2   3   4   5   5
  Service level:          6   6   6   6   7
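Since the dependent variable above has three categories, this is a case of multiple discriminant analysis, which derives two discriminant functions. A sketch of fitting it, assuming scikit-learn and taking the data values from the example table as reconstructed here:

```python
# Sketch: multiple discriminant analysis on the three-group example,
# assuming scikit-learn; data values follow the example table.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Columns: price competitiveness, service level.
X = np.array([
    [2, 2], [1, 2], [3, 2], [2, 1], [2, 3],  # Group 1: definitely switch
    [4, 2], [4, 3], [5, 1], [5, 2], [5, 3],  # Group 2: undecided
    [2, 6], [3, 6], [4, 6], [5, 6], [5, 7],  # Group 3: definitely not switch
])
y = np.repeat([1, 2, 3], 5)

lda = LinearDiscriminantAnalysis().fit(X, y)
# Three groups -> two discriminant functions: min(m, groups - 1).
print(lda.transform(X).shape[1])  # number of discriminant functions
print(lda.predict([[5, 6]]))      # classify a new respondent
```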
Selection of dependent and independent variables. Sample size (total & per variable). Sample division for validation.
Categorizing a metric variable: use the metric scale responses to develop nonmetric categories. For example, use a question asking the typical number of soft drinks consumed per week and develop a three-category variable: 0 drinks for non-users, 1 to 5 for light users, and 5 or more for heavy users.
Polar extremes approach: compare only the two extreme groups and exclude the middle group(s).
The dependent variable must be nonmetric, representing groups of objects that are expected to differ on the independent variables. Choose a dependent variable that: best represents group differences of interest, defines groups that are substantially different, and minimizes the number of categories while still meeting the research objectives. In converting metric variables to a non-metric scale for use as the dependent variable, consider using extreme groups to maximize the group differences. Independent variables must identify differences between at least two groups to be of any use in discriminant analysis.
The sample size must be large enough to:
have at least one more observation per group than the number of independent variables, striving for at least 20 cases per group;
have 20 cases per independent variable, with a recommended minimum of 7 observations per variable;
allow division into an estimation sample and a holdout sample, each meeting the above requirements.
Assess the equality of covariance matrices with Box's M test, but apply a conservative significance level of .01. Examine the independent variables for univariate normality. Multicollinearity among the independent variables can markedly reduce their estimated impact in the derived discriminant function(s).
Key Assumptions
Multivariate normality of the independent variables. Equal variance and covariance for the groups.
Other Assumptions
Minimal multicollinearity among independent variables. Group sample sizes relatively equal. Linear relationships. Elimination of outliers.
Model
Data Structure: each object is characterized by a set of measurements, and we also know to which group each object belongs, e.g.,
object 1: (x11, x12, ..., x1m) - Group I
object 2: (x21, x22, ..., x2m) - Group II
Index Function: D = b0 + b1 x1 + b2 x2 + b3 x3 + ... + bm xm, where D is the discriminant score, the b's are discriminant coefficients or weights, and the x's are independent variables. Note that D and the b's are NOT observed. The method of finding the weights (bj): a set of weights is generated so as to maximize the ratio of between-group variation to within-group variation. Thus, variation in D between the two groups is as large as possible, while variation in D within each group is as small as possible.
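For the two-group case, the weights that maximize the between-to-within variation ratio have a closed form: b = W^{-1}(mean1 - mean2), where W is the pooled within-group covariance matrix. A NumPy sketch with illustrative data (weights are determined only up to a scale factor):

```python
# Sketch of finding the weights b_j for two groups via the closed-form
# Fisher solution b = W^{-1} (mean1 - mean2); NumPy only, illustrative data.
import numpy as np

g1 = np.array([[2., 2.], [1., 2.], [3., 2.], [2., 1.], [2., 3.]])
g2 = np.array([[4., 5.], [5., 6.], [5., 7.], [6., 6.], [6., 5.]])

m1, m2 = g1.mean(axis=0), g2.mean(axis=0)
# Pooled within-group covariance matrix W.
W = (np.cov(g1, rowvar=False) * (len(g1) - 1)
     + np.cov(g2, rowvar=False) * (len(g2) - 1)) / (len(g1) + len(g2) - 2)

b = np.linalg.solve(W, m1 - m2)   # discriminant weights (up to scale)
D1 = g1 @ b                       # discriminant scores, group 1
D2 = g2 @ b                       # discriminant scores, group 2
print(b, D1.mean(), D2.mean())
```

By construction the group mean scores differ: (m1 - m2)' W^{-1} (m1 - m2) > 0 whenever W is positive definite, so between-group variation in D is maximized relative to within-group variation.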
Related Questions
How well does the discriminant function classify the sample? Is the discriminant function statistically significant? What is the relative importance of the independent variables? etc.
c) compute the cutting score: Dcs = (n2 D̄1 + n1 D̄2) / (n1 + n2), where D̄1 and D̄2 are the mean discriminant scores of groups 1 and 2, and n1 and n2 are the group sizes;
d) estimate the discriminant score D for each respondent;
e) compare an individual discriminant score D with the cutting score Dcs and assign the respondent to one of the two groups;
f) construct the confusion matrix (2 x 2 table), whose diagonal shows the hit rate (the proportion correctly classified), and evaluate the model.
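The classification steps above can be sketched as follows, assuming NumPy and that the discriminant scores for each respondent have already been estimated (the score values here are illustrative):

```python
# Sketch of the cutting score, assignment, and confusion matrix steps;
# NumPy only, illustrative discriminant scores.
import numpy as np

D1 = np.array([1.8, 2.1, 2.4, 1.9, 2.0])  # scores, group 1 (illustrative)
D2 = np.array([0.2, 0.5, 0.1, 0.6, 0.4])  # scores, group 2 (illustrative)
n1, n2 = len(D1), len(D2)

# Cutting score: weighted average of the two group mean scores.
Dcs = (n2 * D1.mean() + n1 * D2.mean()) / (n1 + n2)

# Assign each respondent by comparing its score with the cutting score
# (here group 1 lies above the cut).
pred1 = np.where(D1 > Dcs, 1, 2)
pred2 = np.where(D2 > Dcs, 1, 2)

# Confusion matrix: rows = actual group, columns = predicted group.
confusion = np.array([[np.sum(pred1 == 1), np.sum(pred1 == 2)],
                      [np.sum(pred2 == 1), np.sum(pred2 == 2)]])
hit_rate = np.trace(confusion) / (n1 + n2)  # proportion correctly classified
print(confusion, hit_rate)
```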
Wilks' lambda: the ratio of the within-groups sum of squares to the total sum of squares, with a corresponding F-statistic for testing
H0: the group means are equal
Ha: the group means are not equal
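The univariate version of this test can be sketched directly, assuming SciPy; the group values are illustrative:

```python
# Sketch of the univariate significance test: Wilks' lambda
# (within-groups SS / total SS) and its F statistic; SciPy assumed.
import numpy as np
from scipy import stats

groups = [np.array([2., 1., 3., 2., 2.]),   # illustrative group values
          np.array([4., 4., 5., 5., 5.]),
          np.array([2., 3., 4., 5., 5.])]
allv = np.concatenate(groups)
k, N = len(groups), len(allv)

ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
ss_total = ((allv - allv.mean()) ** 2).sum()
wilks = ss_within / ss_total    # near 1 => little group difference

F = ((ss_total - ss_within) / (k - 1)) / (ss_within / (N - k))
p = stats.f.sf(F, k - 1, N - k)  # small p => reject equality of group means
print(wilks, F, p)
```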