missing values, highlight the variables whose missing values are to be
replaced and transfer them to the right-hand box. Under Method, select Series
mean, then click OK. The software will create one additional variable.
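As a rough illustration of what the series-mean method does, the Python sketch below replaces each missing value with the mean of the observed values and stores the result in a new variable, leaving the original untouched. It is not SPSS code; the function name `replace_missing_with_series_mean` and the sample ratings are made up for illustration.

```python
from statistics import mean

def replace_missing_with_series_mean(values):
    """Return a new list in which every None is replaced by the mean
    of the non-missing values (assumes at least one value is present)."""
    observed = [v for v in values if v is not None]
    series_mean = mean(observed)
    return [series_mean if v is None else v for v in values]

# Hypothetical ratings; None marks a missing response.
rating = [4, None, 6, 5, None, 5]

# The filled series is kept as a separate variable, as in the step above.
rating_filled = replace_missing_with_series_mean(rating)
print(rating_filled)
```

The original list is not modified, which mirrors the additional variable mentioned above.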
Operation 2: To analyze the data separately for each category, use the Split
File command. It is generally used for data organized by specific categories,
e.g. 4 cities (Delhi, Mumbai, Kolkata, Chennai), gender, or company-wise
data. Go to Data, click Split File, click Compare groups, and in the box
named Groups Based on put the appropriate category variable; for example, in
our research we may enter gender as the group. Click OK to continue.
* To unsplit the file, go to Data, click Split File, select Analyze all cases,
do not create groups, and click OK to continue.
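The idea behind splitting a file can be sketched in Python as grouping the cases by a category and then running the same analysis once per group. This is only an analogy under made-up data; the helper `split_by` and the respondent records are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def split_by(records, key):
    """Group records (dicts) by the value stored under `key`."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record)
    return dict(groups)

# Hypothetical respondents with a grouping variable and a rating.
respondents = [
    {"gender": "F", "rating": 7},
    {"gender": "M", "rating": 5},
    {"gender": "F", "rating": 9},
    {"gender": "M", "rating": 6},
]

# "Split" on gender, then compute the same statistic for each group.
by_gender = split_by(respondents, "gender")
group_means = {
    group: mean([row["rating"] for row in rows])
    for group, rows in by_gender.items()
}
print(group_means)
```

Analyzing `respondents` directly, without calling `split_by`, corresponds to the unsplit file.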
Operation 3: To calculate descriptive statistics, SPSS gives you 3 options.
Option 1- Click Analyze, click Descriptive Statistics, click Frequencies, and
highlight the variable for which descriptive statistics are to be calculated.
Transfer it to the right window. Click Statistics and select the required
options. Click Continue, then click OK, and your output is ready.
Option 2- Go to Analyze, Descriptive Statistics, Descriptives. Transfer the
required variable from left to right, click Options, and select the required
options. Click Continue, then click OK to get your output.
Option 3- Go to Analyze, Compare Means, click Means, and transfer the
required variable into the Dependent List. Click Options, select the required
statistics, and transfer them to the right. Click Continue, then click OK.
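For readers who want to check such output by hand, the following Python sketch computes a few of the usual descriptive statistics with the standard library. The `describe` helper and the ratings are invented for illustration, and the standard deviation shown is the sample version (n - 1 in the denominator).

```python
from statistics import mean, median, multimode, stdev

def describe(values):
    """Return a small dictionary of descriptive statistics."""
    return {
        "n": len(values),
        "mean": mean(values),
        "median": median(values),
        "mode": multimode(values),   # list, since there may be ties
        "std_dev": stdev(values),    # sample standard deviation (n - 1)
        "minimum": min(values),
        "maximum": max(values),
    }

# Hypothetical ratings on a 10-point scale.
ratings = [3, 4, 2, 5, 3, 4, 5, 3]
summary = describe(ratings)
for name, value in summary.items():
    print(name, value)
```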
Operation 4: To group a scale variable into intervals, go to Transform, click
Visual Binning, select the required variable, transfer it to the right, and
click Continue. Click Make Cutpoints, click in First Cutpoint Location and put
the appropriate value, click Width and put your interval size (e.g. 5, 10,
15), click Number of Cutpoints, and click Apply. Click Make Labels, click in
Binned Variable and give an appropriate name to the new variable, then click
OK to create it.
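The binning steps above amount to building equal-width cutpoints from a first cutpoint, a width and a count, and then assigning each value to an interval. The Python sketch below shows that logic. It assumes that each cutpoint is the inclusive upper end of its bin and that values above the last cutpoint fall into one extra bin; the function names and the ages are hypothetical.

```python
from bisect import bisect_left

def make_cutpoints(first, width, count):
    """Equal-width cutpoints: first, first + width, first + 2 * width, ..."""
    return [first + i * width for i in range(count)]

def bin_value(value, cutpoints):
    """Return a 1-based bin number for `value`.

    Assumption: each cutpoint is the inclusive upper end of its bin,
    and anything above the last cutpoint goes into a final bin.
    """
    return bisect_left(cutpoints, value) + 1

cutpoints = make_cutpoints(first=10, width=10, count=3)  # [10, 20, 30]

# Hypothetical ages to be binned into a new variable.
ages = [7, 10, 18, 25, 30, 42]
age_binned = [bin_value(age, cutpoints) for age in ages]
print(cutpoints, age_binned)
```

With three cutpoints there are four bins, the last one collecting everything above 30.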
compared with the value 0.05 lies in the acceptance region because 0.401 >
0.05. Hence, we do not reject our null hypothesis and conclude that there is
no significant difference in the variances of the 2 cities.
Since the hypothesis of equal variances is not rejected, we continue with
the row for equal variances assumed. Hence, we take 0.010 as our significance
value for the t-test and compare it with 0.05 at the 5% level of
significance. The significance value 0.010 < 0.05. Hence, we reject our null
hypothesis and conclude that the perception of the brand in the 2 cities is
significantly different.
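The two-step reading of the output described above (first the test for equality of variances, then the matching t-test row) can be written as a small decision rule. The Python sketch below uses the two significance values quoted in the text; the function names are hypothetical and the code only reproduces the comparison logic, not the tests themselves.

```python
def pick_t_test_row(variance_test_p, alpha=0.05):
    """Choose which row of the t-test output to read, based on the
    significance value of the test for equality of variances."""
    if variance_test_p > alpha:
        return "equal variances assumed"
    return "equal variances not assumed"

def reject_null(p_value, alpha=0.05):
    """True when the significance value falls below alpha."""
    return p_value < alpha

row = pick_t_test_row(variance_test_p=0.401)   # value quoted in the text
decision = reject_null(p_value=0.010)          # value quoted in the text
print(row, decision)
```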
-------------------------------------------------------------------------------------------------
Paired Sample T-test:
When we want to compare the attitude of the same people, e.g. towards a
particular brand before and after an advertising campaign, we use a paired
sample t-test. Let us assume that we have surveyed a sample of 18
respondents before and after an advertising campaign. Each was asked to rate
their attitude towards the brand on a 10-point scale, where 1 represents
brand is highly disliked and 10 represents brand is highly liked.
Following is the data collected in the survey.
Sr. No.   Before   After
1         3        5
2         4        6
3         2        6
4         5        7
5         3        8
6         4        4
7         5        6
8         3        7
9         4        5
10        2        4
11        2        6
12        4        7
13        1        4
14        3        6
15        6        8
16        3        4
17        2        5
18        3        6
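To make the paired comparison concrete, the Python sketch below computes the paired t statistic by hand from the 18 before and after ratings listed above. It is a minimal standard-library calculation, not the SPSS procedure; the critical value 2.110 is the two-tailed 5% value for 17 degrees of freedom taken from standard t tables, and the variable names are chosen for illustration.

```python
from math import sqrt
from statistics import mean, stdev

# Ratings of the 18 respondents, in serial-number order.
before = [3, 4, 2, 5, 3, 4, 5, 3, 4, 2, 2, 4, 1, 3, 6, 3, 2, 3]
after = [5, 6, 6, 7, 8, 4, 6, 7, 5, 4, 6, 7, 4, 6, 8, 4, 5, 6]

# A paired test works on the per-respondent differences.
differences = [a - b for a, b in zip(after, before)]
n = len(differences)
mean_diff = mean(differences)
std_error = stdev(differences) / sqrt(n)   # sample SD of the differences
t_statistic = mean_diff / std_error
degrees_of_freedom = n - 1

T_CRITICAL = 2.110  # two-tailed, alpha = 0.05, 17 degrees of freedom
significant = abs(t_statistic) > T_CRITICAL

print(mean_diff, round(t_statistic, 2), degrees_of_freedom, significant)
```

With these ratings the mean difference is 2.5 and the t statistic is roughly 8.19 on 17 degrees of freedom, well beyond the critical value, so the change in attitude would be judged significant at the 5% level.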
Purchase Intention
Code   Label
1      None
2      Low
3      High
4      Very High
5      Certain
The chi-square test of independence helps us test the hypothesis that two
variables A and B are independent of each other. In the given example, we
want to test whether purchase intent is independent of income group.
Ho= Purchase intent is independent of income group
When you request a chi-square test in the same output, the software gives
you the result of the chi-square test of independence. In the given output,
the significance value of the chi-square test is 0.097. Since this value is
higher than the alpha value 0.05, we do not reject our null hypothesis
and conclude that purchase intent is independent of the income group.
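The statistic behind this test compares the observed counts in each cell of the cross-tabulation with the counts expected under independence. The Python sketch below shows the calculation on a small invented table (2 income groups by 3 intent levels); the counts are hypothetical and are not the data behind the output discussed above. The critical value 5.991 is the 5% value for 2 degrees of freedom from standard chi-square tables.

```python
def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a table of
    observed counts given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed_count in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed_count - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof

# Hypothetical counts: rows = income groups, columns = intent levels.
observed = [
    [12, 20, 18],
    [18, 20, 12],
]
statistic, dof = chi_square_independence(observed)

CHI2_CRITICAL = 5.991  # alpha = 0.05, 2 degrees of freedom
independent = statistic < CHI2_CRITICAL
print(round(statistic, 3), dof, independent)
```

Comparing the statistic with a critical value is equivalent to comparing the significance value with alpha, which is the form the software output takes.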
ANOVA:
Completely randomized design in one-way ANOVA. This design is used
when there is only one independent categorical variable and one dependent
variable. Each category of the independent variable is called a level. The
independent variable may be different levels of price, packages, or
colors, and the dependent variable can be the effect on sales.
Case 1: An advertising company has 3 different versions of an ad copy for
a campaign. These 3 versions can be called copy 1, 2 and 3. The ad
agency wants to test which of these versions is preferred by its target
population before it launches the campaign. It has collected responses
from 18 respondents from the target population in the nearby areas of the
city, and these 18 respondents were assigned to the 3 versions of the copy,
so that each version is shown to six respondents. The respondents are asked
to rate their liking for the ad copy on a scale of 1-10 such that 1= not
liked at all and 10= liked a lot.
Go to Analyze, Compare Means, One-Way ANOVA, put ratings in the Dependent
List, ad copy as the Factor variable, and click OK.
Ho= There is no significant difference in the mean ratings given to all the 3 ads (i.e. A= B= C)
Ha= At least one of the ads is significantly different.
At the 5% level of significance, alpha is 0.05 and the significance value is
0.203, which is greater than 0.05. Hence we do not reject our null
hypothesis and conclude that there is no significant difference in the mean
ratings of the 3 ads.
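The F ratio behind a one-way ANOVA compares the variation between group means with the variation within groups. The Python sketch below works this out for three groups of six ratings. The ratings are invented for illustration (the case data are not reproduced in the text), and 3.68 is the 5% critical value of F for 2 and 15 degrees of freedom from standard tables.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (F ratio, df between, df within) for a list of groups."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the ratings around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Hypothetical ratings, six respondents per ad copy.
copy_1 = [6, 7, 5, 8, 6, 7]
copy_2 = [5, 6, 7, 6, 8, 7]
copy_3 = [7, 8, 6, 7, 9, 8]

f_ratio, df_between, df_within = one_way_anova([copy_1, copy_2, copy_3])

F_CRITICAL = 3.68  # alpha = 0.05, (2, 15) degrees of freedom
reject_null = f_ratio > F_CRITICAL
print(round(f_ratio, 3), df_between, df_within, reject_null)
```

For these made-up ratings the F ratio is below the critical value, so the null hypothesis of equal mean ratings would not be rejected, which is the same kind of conclusion as in the case above.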
Randomized Block Design:
In the case above, suppose we make a slight change and add a block
variable (magazine): does it change the perception of respondents
towards the ad copy? The 3 versions of the ad copy were each used in
six different magazines, coded as 1, 2, 3, 4, 5 and 6.
Out of the people who saw these ads, 18 randomly chosen respondents
are picked, each of whom has seen a particular version of the ad. Thus we
finally have 1 respondent who has seen a given version of the ad in a given
magazine. In other words, we have one respondent for every combination
of magazine and ad copy.
1st Hypothesis:
Ho= There is no significant difference in the mean ratings given to all the 3
ads.
Ha= At least one ad is significantly different from the others.
2nd Hypothesis:
Ho= The block/magazine does not create a significant change in the
perception of people.
Ha= At least one magazine is significantly different.
Go to Analyze, General Linear Model, Univariate, put ratings in the
Dependent Variable box, put ad copy as a fixed factor and magazine as a
random factor, and click OK.
Hypothesis 1: At the 5% level of significance, our significance value 0.005
is less than 0.05. Hence we reject our null hypothesis and conclude that the
mean rating of at least 1 of the ad copies is significantly different.
Hypothesis 2: At the 5% level of significance, our significance value is
0.000, which is less than 0.05. Hence we reject our null hypothesis and
conclude that magazine creates a significant difference in the perception of
people towards the ads. This is further confirmed by the rejection of
hypothesis 1, which was not rejected in the one-way ANOVA.
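Why blocking can turn a non-significant ad effect into a significant one is easier to see in the arithmetic: the variation due to magazines is taken out of the error term before the ad copies are compared. The Python sketch below does this for a 3 by 6 layout with one rating per cell. The ratings are invented for illustration and are not the case data; 4.10 and 3.33 are the 5% critical values of F for (2, 10) and (5, 10) degrees of freedom from standard tables.

```python
from statistics import mean

def randomized_block_anova(table):
    """table[i][j] is the rating for treatment i (ad copy) in block j
    (magazine), one observation per cell.
    Returns (F for treatments, F for blocks, df_treat, df_block, df_error)."""
    k = len(table)       # number of treatments
    b = len(table[0])    # number of blocks
    values = [x for row in table for x in row]
    grand_mean = mean(values)

    ss_total = sum((x - grand_mean) ** 2 for x in values)
    ss_treat = b * sum((mean(row) - grand_mean) ** 2 for row in table)
    ss_block = k * sum((mean(col) - grand_mean) ** 2 for col in zip(*table))
    ss_error = ss_total - ss_treat - ss_block

    df_treat, df_block = k - 1, b - 1
    df_error = df_treat * df_block
    ms_error = ss_error / df_error
    f_treat = (ss_treat / df_treat) / ms_error
    f_block = (ss_block / df_block) / ms_error
    return f_treat, f_block, df_treat, df_block, df_error

# Hypothetical ratings: rows = ad copies 1-3, columns = magazines 1-6.
ratings = [
    [2, 4, 6, 7, 8, 9],
    [3, 4, 7, 7, 9, 9],
    [4, 6, 7, 9, 9, 10],
]
f_treat, f_block, df_treat, df_block, df_error = randomized_block_anova(ratings)

ads_differ = f_treat > 4.10        # alpha = 0.05, (2, 10) df
magazines_differ = f_block > 3.33  # alpha = 0.05, (5, 10) df
print(round(f_treat, 2), round(f_block, 2), ads_differ, magazines_differ)
```

On these made-up numbers both effects are significant, whereas a one-way ANOVA that ignores magazine gives an F ratio of only about 0.58 for the ad copies, because the magazine-to-magazine variation is then left in the error term.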
Factorial design with 2 or more factors:
This type of design is employed when we have 2 or more independent
variables or factors. The major advantage of this design is that multiple
factors can be tested simultaneously. In such a design, 2 kinds of
effects can be tested: the main effect of each factor and the
interaction effect between factors.
When you look at the correlations of all the variables with each other, the
correlation values are standardized between -1 and +1. Looking at the last
column, we find that except for the competition index, all the other
variables are highly correlated with sales, with coefficients ranging from
0.732 to 0.95. This means we have chosen a fairly good set of independent
variables. Only the index of competitors' activity does not appear to be
strongly correlated with sales, as its correlation coefficient is -0.5. We
must also note that these are one-to-one correlations of each variable with
sales and with each other. So we may still want to run a multiple regression
with all these independent variables, because it is possible that in the
presence of the other variables this independent variable becomes a good
predictor of the dependent variable.
The other thing we need to check in the table is whether the independent
variables are highly correlated with each other. If they are, as in this
case, it indicates that we may not be able to use all of them and may be
able to use only 1 or 2, so we end up eliminating a few independent
variables.
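The check for independent variables that are highly correlated with each other can be sketched in Python as computing the pairwise correlation coefficients and flagging the pairs above some cut-off. The data below are invented and the cut-off of 0.8 is an arbitrary assumption chosen for the example; the text does not prescribe a threshold.

```python
from itertools import combinations
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    variation_x = sum((a - mean_x) ** 2 for a in x)
    variation_y = sum((b - mean_y) ** 2 for b in y)
    return covariation / sqrt(variation_x * variation_y)

def highly_correlated_pairs(variables, threshold=0.8):
    """List the pairs of variables whose |r| reaches the threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(variables.items(), 2):
        r = pearson_r(a, b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, round(r, 3)))
    return flagged

# Hypothetical predictor values for five territories.
predictors = {
    "dealers": [1, 2, 3, 4, 5],
    "sales_people": [2, 4, 6, 8, 10],  # moves in step with dealers
    "competition": [3, 1, 4, 1, 5],
}
flagged = highly_correlated_pairs(predictors)
print(flagged)
```

A flagged pair suggests that the two variables carry much the same information, so keeping both in the regression adds little.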
The result of the regression model gives us the coefficients of the model,
which are also called the B-list. A (intercept)= -3.1728, B1= 0.2268, B2=
0.819, B3= 1.09, B4= -1.89, B5= -0.54 and B6= 0.065. When you substitute
these values in the equation, you get the following equation:
Sales = -3.17 + 0.23(potential) + 0.82(dealers) + 1.09(sales people) - 1.89(competition activity) - 0.55(service centers) + 0.07(existing customers)
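To show how the fitted equation is used, the Python sketch below plugs predictor values into it and compares two scenarios. The coefficients are the rounded ones from the equation above; the dictionary keys, the `predict_sales` helper and the territory values are hypothetical and only serve the illustration.

```python
# Rounded coefficients taken from the equation in the text.
INTERCEPT = -3.17
COEFFICIENTS = {
    "potential": 0.23,
    "dealers": 0.82,
    "sales_people": 1.09,
    "competition": -1.89,
    "service_centers": -0.55,
    "existing_customers": 0.07,
}

def predict_sales(territory):
    """Predicted sales for a dict holding one value per predictor."""
    return INTERCEPT + sum(
        COEFFICIENTS[name] * territory[name] for name in COEFFICIENTS
    )

# Hypothetical territory, made up for the example.
territory = {
    "potential": 40,
    "dealers": 5,
    "sales_people": 4,
    "competition": 3,
    "service_centers": 2,
    "existing_customers": 30,
}

baseline = predict_sales(territory)
one_more_person = predict_sales(
    {**territory, "sales_people": territory["sales_people"] + 1}
)
print(round(baseline, 2), round(one_more_person - baseline, 2))
```

Changing one predictor by one unit while holding the others fixed changes the prediction by exactly that predictor's coefficient, here 1.09 for one additional sales person.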
Before we use this equation, we need to check the statistical significance of
the model and the R-squared value. From the ANOVA table, the p-level is
0.00004. This indicates that the model is statistically significant at a
confidence level of 1 - 0.00004, or about 99.99%.
The R-squared value is about 0.97, which indicates that 97.7% of the
variation in sales can be explained by the given independent variables. We
also note that the significance of the individual independent variables
indicates that at the significance level of 0.10 (equivalent to a confidence
level of 90%) only potential and sales people are statistically significant
in the model. The other 4 independent variables are individually not
significant. However, for the time being, we shall use the model as it is
and try to apply it for decision making. The real use of the regression
model would be to predict sales in lakhs, given the values of all the
independent variables, or to check the impact of a change in some of them on
the sales figure of a territory. The equation we obtained simply means that
sales will increase in a territory if the potential increases, the no. of
dealers increases, the no. of sales people increases, the level of
competition decreases, the no. of service centers decreases, or the no. of
existing customers increases.
The estimated change in sales for every unit increase or decrease in these
variables is given by the coefficients of the respective variables. For
instance, if the no. of sales people is increased by 1, sales will increase
by 1.09 lakhs. Similarly, if 1 more dealer is added, we expect sales to
increase by 0.82 lakhs (82,000), if the other variables are kept constant.
The variable service does not make much sense: if we increase the number of
service centers, according to the output, sales are estimated to decrease by
0.55 lakhs (55,000). When we look at the significance level of service, we
find that this variable is insignificant. Hence, keeping this variable as a
part of the regression model would be unwise.