
Data Analysis

Frequency Distribution
In a frequency distribution, one variable is
considered at a time.
A frequency distribution for a variable produces a
table of frequency counts, percentages, and
cumulative percentages for all the values associated
with that variable.
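As a minimal sketch of how such a table might be built, the Python snippet below tabulates counts, percentages, and cumulative percentages for a single variable. The `ratings` data and the function name `frequency_table` are hypothetical, chosen only for illustration.

```python
from collections import Counter

def frequency_table(values):
    """Return rows of (value, count, percent, cumulative percent), sorted by value."""
    counts = Counter(values)
    n = len(values)
    rows = []
    cumulative = 0
    for value in sorted(counts):
        cumulative += counts[value]
        rows.append((value, counts[value],
                     100 * counts[value] / n,
                     100 * cumulative / n))
    return rows

# Hypothetical responses on a 1-5 scale (illustrative data, not from the slides)
ratings = [3, 5, 4, 3, 2, 5, 3, 4, 1, 3]
table = frequency_table(ratings)

print("value count percent cum_pct")
for value, count, pct, cum_pct in table:
    print(f"{value:>5} {count:>5} {pct:>7.1f} {cum_pct:>7.1f}")
```

The last row's cumulative percentage is always 100, which is a quick check that no value was dropped.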
Statistics Associated with Frequency
Distribution
Measures of Location
The mean, or average value, is the most commonly used
measure of central tendency. The mean, $\bar{X}$, is given by

$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$$

where
$X_i$ = observed values of the variable X
n = number of observations (sample size)

The mode is the value that occurs most frequently. It
represents the highest peak of the distribution. The mode
is a good measure of location when the variable is
inherently categorical or has otherwise been grouped into
categories.
Statistics Associated with Frequency
Distribution
Measures of Location
The median of a sample is the middle value when
the data are arranged in ascending or descending
order. If the number of data points is even, the
median is usually estimated as the midpoint between
the two middle values by adding the two middle
values and dividing their sum by 2. The median is
the 50th percentile.
Statistics Associated with Frequency
Distribution
Measures of Variability

The variance is the mean squared deviation from the mean:

$$s_x^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$$

The standard deviation is the square root of the variance.
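The measures of location and variability described so far can be checked with Python's standard `statistics` module. This is only a sketch: the observations in `x` are hypothetical, and `statistics.variance` is used because it divides by n - 1, matching the sample variance defined in these slides.

```python
import statistics

# Hypothetical observations (illustrative only, not from the slides)
x = [2, 4, 4, 5, 7, 8]

mean = statistics.mean(x)          # sum of X_i divided by n
median = statistics.median(x)      # even n: midpoint of the two middle values
mode = statistics.mode(x)          # most frequently occurring value
variance = statistics.variance(x)  # sample variance, divisor n - 1
std_dev = statistics.stdev(x)      # square root of the sample variance

print(mean, median, mode, variance, round(std_dev, 3))
```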
Cross-Tabulation
While a frequency distribution describes one variable
at a time, a cross-tabulation describes two or more
variables simultaneously.
Cross-tabulation results in tables that reflect the joint
distribution of two variables with a limited number of
categories or distinct values.
Since two variables have been cross-classified,
percentages can be computed either columnwise,
based on column totals, or rowwise, based on row
totals.
The general rule is to compute the percentages in the
direction of the independent variable, across the
dependent variable.
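A minimal sketch of the rule, assuming the independent variable (gender) is placed in the columns so that percentages are computed columnwise. The cell counts (15 respondents per gender) and the helper name `column_percentages` are assumptions made for illustration.

```python
def column_percentages(table):
    """Convert cell counts to percentages of each column total.

    `table` maps row label -> {column label -> count}.
    """
    columns = list(next(iter(table.values())))
    totals = {c: sum(row[c] for row in table.values()) for c in columns}
    return {r: {c: 100 * row[c] / totals[c] for c in columns}
            for r, row in table.items()}

# Assumed counts: 15 males and 15 females (illustrative)
counts = {
    "Light": {"Male": 5, "Female": 10},
    "Heavy": {"Male": 10, "Female": 5},
}
pct = column_percentages(counts)

for label, row in pct.items():
    print(label, {c: round(v, 1) for c, v in row.items()})
```

Each column of the result sums to 100%, so the two genders can be compared directly across the categories of the dependent variable.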
Pepsi Consumption by Gender

                     Gender
Pepsi Consumption    Male      Female
Light                33.3%     66.7%
Heavy                66.7%     33.3%
Column total         100%      100%


Purchase of Fashion Clothing by Marital
Status

Purchase of             Current Marital Status
Fashion Clothing        Married     Unmarried
High                    31%         52%
Low                     69%         48%
Column total            100%        100%
Number of respondents   700         300
Purchase of Fashion Clothing by Marital
Status

                     Sex
Purchase of          Male                      Female
Fashion Clothing     Married    Not Married    Married    Not Married
High                 35%        40%            25%        60%
Low                  65%        60%            75%        40%
Column totals        100%       100%           100%       100%
Number of cases      400        120            300        180
Case Cleopatra

Prelaunch Market Research


Objective to assess
response to Cleopatra advt.
Product acceptance
Design
Supergroup
Ad test, Product Placement
Methodology
??
Toronto
Case Cleopatra

Prelaunch Market Research


Results
Positive from the group
50% Buying Intention Post Ad
64% Buying Intention Post Trial
Decision
Launch in Quebec
Premium
Advt. and some consumer promotion
Case Cleopatra

Prelaunch Market Research


Problems
Location
Beyond Trial
Adoption, Purchase Frequency

Poor performance
Sales
Case: Cleopatra

Post Launch Study


204 All Soap Users
99 Cleopatra Users (Try)
Results
High Awareness
73.5%
Low Trials
14%
Case Cleopatra

Trial Implications
Lost Opportunity
73.5% → 14.2%
Critical factor
High awareness not enough
Awareness, Interest, Evaluation, Trial, Adoption
Case Cleopatra

Low Trials Reasons


Lack of adequate promotional support
Low redemption of coupons
Sweepstakes did not work at all
Problems with the ad : Exhibit 13
63% do not intend to try
59% no or a negative reaction to the Cleopatra
Why?
Case Cleopatra

Problems with the ad


Jug of perfume being poured
Strong smell a problem
Perceived to be harsh and not for skin care
Footnote Exhibit 11
Execution of bath
Showers outnumber baths 4:1 in Quebec (ex. 12)
Not for everyday usage
67% occasional usage (ex. 12)
Case Cleopatra

Decision Options
Discontinue brand
Continue the current strategy
4.5% market share
Smaller niche
Case Cleopatra

Decision Options
Discontinue brand
Subsidiary/Sales force reputation
Externally
Internally
Need a contender for skin care segment
Case Cleopatra

Decision Options
Continue the current strategy
Significantly higher trial levels
Increase in promotions
Increase in expenses
More losses
Case Cleopatra

Brand Performance
High Conversion rate
Strong diagnostics among users
Exhibit 10
Skin care 50%
Fragrance 53%
Case Cleopatra: Exhibit 9

Brand               Conversion rate (all + most occasions / ever tried)
Aloe and Lanolin    16%
Camay               14%
Cleopatra           31%
Dove                21%
Palmolive           12%
Case Cleopatra

Scale down expectations


Target a smaller segment
Need to profile current acceptors
Need to promote to this group
Change advertising- low/drop.
Reduce distribution coverage
With better incentives
Further Analysis: Crosstabs

Exhibits 9 and 10
Dove Regular vs. Others
Age segments
MHI groups
Problem 0

Pepsi has conducted a pilot U & A study
for its brands. It has found that favourite
brand varies across males and females.
It found that 5/15 males and 10/15
females prefer Mirinda and the reverse
is true for Pepsi. How should Pepsi test
this relationship?
Statistics Associated with Cross-
Tabulation
Chi-Square
To determine whether a systematic association
exists, the probability of obtaining a value of chi-
square as large or larger than the one calculated
from the cross-tabulation is estimated.
An important characteristic of the chi-square statistic
is the number of degrees of freedom (df) associated
with it. That is, df = (r - 1) x (c - 1), where r is the
number of rows and c is the number of columns.
The null hypothesis (H0) of no association between
the two variables will be rejected only when the
calculated value of the test statistic is greater than
the critical value of the chi-square distribution with the
appropriate degrees of freedom.
Statistics Associated with Cross-
Tabulation
Chi-Square
The chi-square statistic ($\chi^2$) is used to test the
statistical significance of the observed association in
a cross-tabulation.
The expected frequency for each cell can be
calculated by using a simple formula:

$$f_e = \frac{n_r n_c}{n}$$

where $n_r$ = total number in the row
$n_c$ = total number in the column
n = total sample size
Statistics Associated with Cross-
Tabulation
Chi-Square
For the data in the table, the expected frequencies
for the cells, going from left to right and from top to
bottom, are:

$$\frac{15 \times 15}{30} = 7.50 \qquad \frac{15 \times 15}{30} = 7.50$$
$$\frac{15 \times 15}{30} = 7.50 \qquad \frac{15 \times 15}{30} = 7.50$$

Then the value of $\chi^2$ is calculated as follows:

$$\chi^2 = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e}$$


Statistics Associated with Cross-
Tabulation
Chi-Square
For the data in the table, the value of $\chi^2$ is
calculated as:

$$\chi^2 = \frac{(5 - 7.5)^2}{7.5} + \frac{(10 - 7.5)^2}{7.5} + \frac{(10 - 7.5)^2}{7.5} + \frac{(5 - 7.5)^2}{7.5}$$

$$= 0.833 + 0.833 + 0.833 + 0.833$$

$$= 3.333$$
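The chi-square calculation for the Mirinda/Pepsi data can be reproduced with a short sketch in plain Python. The function name `chi_square` is hypothetical, and the critical value 3.841 (0.05 level, 1 degree of freedom) is taken from standard chi-square tables rather than from these slides.

```python
def chi_square(observed):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, f_o in enumerate(row):
            f_e = row_totals[i] * col_totals[j] / n  # expected frequency n_r * n_c / n
            stat += (f_o - f_e) ** 2 / f_e
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return stat, df

# Rows: Mirinda, Pepsi; columns: males, females (counts from Problem 0)
observed = [[5, 10],
            [10, 5]]
stat, df = chi_square(observed)

CRITICAL_05_DF1 = 3.841  # standard table value (assumption: not given in the slides)
reject_h0 = stat > CRITICAL_05_DF1
print(round(stat, 3), df, reject_h0)
```

Under these assumptions the calculated value falls below the critical value, so the null hypothesis of no association would not be rejected at the 0.05 level for this small pilot sample.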
Marketing Problem 1

Vodafone Mobile has conducted a pilot
customer satisfaction study. In a sample of
29 IIM students, the average is 4.724 on a
7-point satisfaction scale, with a std. dev. of
1.579. The minimum acceptable value of
customer satisfaction for the firm is greater
than 4. What should the market research
manager recommend to the marketing
manager?
Hypothesis Testing Using the t
Statistic
1. Formulate the null (H0) and the alternative (H1)
hypotheses.
2. Select the appropriate formula for the t statistic.
3. Select a significance level for testing H0. Typically,
the 0.05 level is selected.
4. Take one or two samples and compute the mean
and standard deviation for each sample.
5. Calculate the t statistic assuming H0 is true.
Hypothesis Testing Using the t
Statistic
6. Calculate the degrees of freedom and estimate the
probability of getting a more extreme value of the
statistic from Table 4. (Alternatively, calculate the
critical value of the t statistic.)
7. If the probability computed in step 6 is smaller than
the significance level selected in step 3, reject H0.
If the probability is larger, do not reject H0.
8. Express the conclusion reached by the t test in
terms of the marketing research problem.
One Sample
t Test

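The hypotheses may be formulated as:

$$H_0: \mu \le 4.0$$
$$H_1: \mu > 4.0$$

$$t = \frac{\bar{X} - \mu}{s_{\bar{X}}}, \qquad s_{\bar{X}} = \frac{s}{\sqrt{n}}$$

$$s_{\bar{X}} = 1.579/\sqrt{29} = 1.579/5.385 = 0.293$$

$$t = (4.724 - 4.0)/0.293 = 0.724/0.293 = 2.471$$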


One Sample
t Test
The degrees of freedom for the t statistic to test the
hypothesis about one mean are n - 1. In this case,
n - 1 = 29 - 1 or 28. From Table 4 in the Statistical
Appendix, the probability of getting a more extreme
value than 2.471 is less than 0.05 (Alternatively, the
critical t value for 28 degrees of freedom and a
significance level of 0.05 is 1.7011, which is less than
the calculated value). Hence, the null hypothesis is
rejected. The satisfaction level does exceed 4.0.
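The one-sample calculation can be sketched in Python as follows. The function name `one_sample_t` is hypothetical; the critical value 1.7011 is the one quoted in the slides. Without intermediate rounding the statistic comes out near 2.469 rather than 2.471, which does not change the conclusion.

```python
import math

def one_sample_t(sample_mean, hypothesized_mean, sample_sd, n):
    """Return (t statistic, degrees of freedom) for a one-sample t test."""
    standard_error = sample_sd / math.sqrt(n)  # s / sqrt(n)
    t = (sample_mean - hypothesized_mean) / standard_error
    return t, n - 1

# Values from Marketing Problem 1
t_stat, df = one_sample_t(sample_mean=4.724, hypothesized_mean=4.0,
                          sample_sd=1.579, n=29)

CRITICAL_T = 1.7011  # from the slides: 28 df, 0.05 significance level
reject_h0 = t_stat > CRITICAL_T
print(round(t_stat, 3), df, reject_h0)
```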
Marketing Problem -2

Levers has launched a new brand of
coffee. It is interested in knowing if
consumers in South and North India are
responding differently to its new
product. What testing procedure do you
recommend to Levers?
Two Independent Samples
Means
In the case of means for two independent samples,
the hypotheses take the following form.

$$H_0: \mu_1 = \mu_2$$
$$H_1: \mu_1 \ne \mu_2$$

The two populations are sampled and the means and
variances computed based on samples of sizes n1
and n2. If both populations are found to have the
same variance, a pooled variance estimate is
computed from the two sample variances as follows:

$$s^2 = \frac{\sum_{i=1}^{n_1} (X_{i1} - \bar{X}_1)^2 + \sum_{i=1}^{n_2} (X_{i2} - \bar{X}_2)^2}{n_1 + n_2 - 2}$$

or

$$s^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}$$
Two Independent Samples
Means

The standard deviation of the test statistic can be
estimated as:

$$s_{\bar{X}_1 - \bar{X}_2} = \sqrt{s^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}$$

The appropriate value of t can be calculated as:

$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{s_{\bar{X}_1 - \bar{X}_2}}$$

The degrees of freedom in this case are (n1 + n2 - 2).
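A minimal sketch of the pooled-variance t test, assuming equal population variances. The `south` and `north` ratings are hypothetical data invented for illustration, and `pooled_t` is a hypothetical helper name.

```python
import math
import statistics

def pooled_t(sample1, sample2):
    """Return (t statistic, degrees of freedom) using a pooled variance estimate."""
    n1, n2 = len(sample1), len(sample2)
    s1_sq = statistics.variance(sample1)  # sample variance, divisor n1 - 1
    s2_sq = statistics.variance(sample2)  # sample variance, divisor n2 - 1
    pooled_var = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    # Under H0, mu1 - mu2 = 0, so that term drops out of the numerator
    t = (statistics.mean(sample1) - statistics.mean(sample2)) / standard_error
    return t, n1 + n2 - 2

# Hypothetical liking scores from two regions (illustrative only)
south = [5, 6, 7, 6, 6]
north = [4, 5, 4, 5, 2]
t_stat, df = pooled_t(south, north)
print(round(t_stat, 3), df)
```

The resulting statistic would then be compared with the critical t value for n1 + n2 - 2 degrees of freedom at the chosen significance level.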


Two Independent Samples
F Test
An F test of sample variance may be performed
if it is not known whether the two populations have
equal variance. In this case, the hypotheses
are:

$$H_0: \sigma_1^2 = \sigma_2^2$$
$$H_1: \sigma_1^2 \ne \sigma_2^2$$
Two Independent Samples
F Statistic
The F statistic is computed from the sample
variances as follows:

$$F_{(n_1 - 1),(n_2 - 1)} = \frac{s_1^2}{s_2^2}$$

where
n1 = size of sample 1
n2 = size of sample 2
n1 - 1 = degrees of freedom for sample 1
n2 - 1 = degrees of freedom for sample 2
$s_1^2$ = sample variance for sample 1
$s_2^2$ = sample variance for sample 2
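As a rough sketch, the variance ratio can be computed as below. The two samples are hypothetical, and `variance_ratio` is an illustrative helper name. Which sample is labeled sample 1 is also a choice made here, not something the slides specify.

```python
import statistics

def variance_ratio(sample1, sample2):
    """Return (F statistic, numerator df, denominator df) for s1^2 / s2^2."""
    s1_sq = statistics.variance(sample1)  # sample variance of sample 1
    s2_sq = statistics.variance(sample2)  # sample variance of sample 2
    return s1_sq / s2_sq, len(sample1) - 1, len(sample2) - 1

# Hypothetical data (illustrative only)
sample_1 = [4, 5, 4, 5, 2]
sample_2 = [5, 6, 7, 6, 6]
f_stat, df1, df2 = variance_ratio(sample_1, sample_2)
print(f_stat, df1, df2)
```

The computed F would be compared with the critical value of the F distribution for (n1 - 1) and (n2 - 1) degrees of freedom.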
Marketing Problem -3

Pepsi has launched two new variants of
diet Pepsi. It has decided to conduct a
product test to arrive at a suitable
product. It has decided to conduct a
C.L.T. on a group of consumers. What
testing procedures would you suggest
to Pepsi?
Paired Samples
The difference in these cases is examined by a paired samples t
test. To compute t for paired samples, the paired difference
variable, denoted by D, is formed and its mean and variance
calculated. Then the t statistic is computed. The degrees of
freedom are n - 1, where n is the number of pairs. The relevant
formulas are:

$$H_0: \mu_D = 0$$
$$H_1: \mu_D \ne 0$$

$$t_{n-1} = \frac{\bar{D} - \mu_D}{s_D / \sqrt{n}}$$
Paired Samples
where,

$$\bar{D} = \frac{\sum_{i=1}^{n} D_i}{n}$$

$$s_D = \sqrt{\frac{\sum_{i=1}^{n} (D_i - \bar{D})^2}{n - 1}}$$

$$s_{\bar{D}} = \frac{s_D}{\sqrt{n}}$$
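A minimal sketch of the paired samples t test, assuming each consumer rates both variants. The ratings in `variant_a` and `variant_b` are hypothetical, and `paired_t` is an illustrative helper name.

```python
import math
import statistics

def paired_t(first, second):
    """Return (t statistic, degrees of freedom) for paired observations."""
    diffs = [a - b for a, b in zip(first, second)]  # paired difference variable D
    n = len(diffs)
    d_bar = statistics.mean(diffs)   # mean of the differences
    s_d = statistics.stdev(diffs)    # standard deviation of D, divisor n - 1
    t = d_bar / (s_d / math.sqrt(n)) # mu_D = 0 under H0
    return t, n - 1

# Hypothetical ratings by the same five consumers (illustrative only)
variant_a = [6, 5, 7, 4, 6]
variant_b = [5, 3, 6, 4, 4]
t_stat, df = paired_t(variant_a, variant_b)
print(round(t_stat, 3), df)
```

Because both ratings come from the same respondent, the test works on the differences rather than treating the two sets of ratings as independent samples.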
Marketing Problem -4

Nestle has launched a new variant of
drinking chocolate. It has decided to
conduct a product test to assess
consumer response. It has divided the
country into 4 geographic zones and
would like to know if regional
differences are relevant for this new
launch. What testing procedures would
you suggest to Nestle?
Relationship Among Techniques
Analysis of variance (ANOVA) is used as a test of
means for two or more populations. The null
hypothesis, typically, is that all means are equal.
Analysis of variance must have a dependent variable
that is metric (measured using an interval or ratio
scale).
There must also be one or more independent
variables that are all categorical (nonmetric).
Categorical independent variables are also called
factors.
Decomposition of the Total
Variation: One-way ANOVA

                      Independent Variable X
                      Categories                       Total Sample
                      X1     X2     X3    ...   Xc
Within-category       Y1     Y1     Y1    ...   Y1     Y1            Total
variation             Y2     Y2     Y2    ...   Y2     Y2            variation
= SSwithin            :      :      :           :      :             = SSy
                      Yn     Yn     Yn    ...   Yn     YN
Category mean         Ȳ1     Ȳ2     Ȳ3    ...   Ȳc     Ȳ

Between-category variation = SSbetween = SSx
Statistics Associated with One-
way
Analysis of Variance
SSbetween. Also denoted as SSx, this is the variation
in Y related to the variation in the means of the
categories of X. This represents variation between
the categories of X, or the portion of the sum of
squares in Y related to X.

SSwithin. Also referred to as SSerror, this is the
variation in Y due to the variation within each of the
categories of X. This variation is not accounted for
by X.

SSy. This is the total variation in Y.


Conducting One-way Analysis of
Variance
Decompose the Total Variation
The total variation in Y, denoted by SSy, can be
decomposed into two components:

SSy = SSbetween + SSwithin

where the subscripts between and within refer to the
categories of X. SSbetween is the variation in Y related
to the variation in the means of the categories of X.
For this reason, SSbetween is also denoted as SSx.
SSwithin is the variation in Y related to the variation
within each category of X. SSwithin is not accounted
for by X. Therefore it is referred to as SSerror.
Conducting One-way Analysis of
Variance
Test Significance
The null hypothesis may be tested by the F statistic
based on the ratio between these two estimates:

$$F = \frac{SS_x / (c - 1)}{SS_{error} / (N - c)} = \frac{MS_x}{MS_{error}}$$

This statistic follows the F distribution, with (c - 1) and
(N - c) degrees of freedom (df).
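The decomposition and the F ratio can be illustrated with a short sketch in plain Python. The four groups below are hypothetical scores for four zones, invented for illustration, and `one_way_anova` is a hypothetical helper name.

```python
def one_way_anova(groups):
    """Return sums of squares, degrees of freedom, and F for a one-way ANOVA."""
    all_values = [y for g in groups for y in g]
    N, c = len(all_values), len(groups)
    grand_mean = sum(all_values) / N
    means = [sum(g) / len(g) for g in groups]
    # Variation of category means around the grand mean (SSx)
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of observations around their own category mean (SSerror)
    ss_within = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    # Total variation of observations around the grand mean (SSy)
    ss_total = sum((y - grand_mean) ** 2 for y in all_values)
    f = (ss_between / (c - 1)) / (ss_within / (N - c))
    return {"SSx": ss_between, "SSerror": ss_within, "SSy": ss_total,
            "df": (c - 1, N - c), "F": f}

# Hypothetical preference scores in four zones (illustrative only)
zones = [[6, 7, 8], [4, 5, 6], [5, 6, 7], [7, 8, 9]]
result = one_way_anova(zones)
print(result)
```

In this sketch SSx and SSerror add up to SSy, which mirrors the decomposition SSy = SSbetween + SSwithin. The F value would be compared with the critical F for (c - 1) and (N - c) degrees of freedom.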
Conducting One-way Analysis of
Variance
Interpret the Results
If the null hypothesis of equal category means is not
rejected, then the independent variable does not
have a significant effect on the dependent variable.
On the other hand, if the null hypothesis is rejected,
then the effect of the independent variable is
significant.
