
Ice Cream Consumption

Introduction: Ice cream is one of the major frozen desserts on the market. Many people enjoy it while watching their favorite television programs or after dinner. Yet what are the key factors that affect how much ice cream people consume? In this paper we investigate the potential factors. The data were collected from March 18, 1951 to July 11, 1953, a total of 30 four-week periods. The potential variables are the price of ice cream (Price), the weekly family income of the consumers (Income), and the temperature (Temp).

Methodology: Since we are interested in the key factors that affect ice cream consumption, we will conduct several tests to identify and examine every potential variable. Because the data were collected over time, we will first draw a time plot to reveal the relationship between ice cream consumption and time. Then we will use box plots to identify outlying samples. Next, we will use the backward selection method to remove any irrelevant variable. Finally, we will run the regression to obtain a model that estimates ice cream consumption.

Analysis: Before conducting any statistical tests, we note that the ice cream consumption data form a time series: the samples were collected every 4 weeks for 30 consecutive periods. As a result, we draw a time plot to check for patterns over the observation period. At the same time, we fit the regression model of ice cream consumption (IC) vs. Date.

Figure 1: Regression Model of IC vs. Date

[Time plot of IC (vertical axis, 0.25 to 0.55) against Date (horizontal axis, 0 to 30) with the fitted regression line.]
IC = 0.3337 + 0.0017 * Date
N = 30, Rsq = 0.0492, AdjRsq = 0.0152, RMSE = 0.0653
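As a rough cross-check of the fitted line reported in Figure 1, the simple regression of IC on Date can be recomputed outside SAS. The following is a minimal Python sketch, not part of the original SAS analysis; the function name simple_ols and the variable names are our own choices, and the IC values are copied from the data appendix.

```python
# IC (pints per capita) for the 30 four-week periods, from the data appendix.
IC = [0.386, 0.374, 0.393, 0.425, 0.406, 0.344, 0.327, 0.288, 0.269, 0.256,
      0.286, 0.298, 0.329, 0.318, 0.381, 0.381, 0.470, 0.443, 0.386, 0.342,
      0.319, 0.307, 0.284, 0.326, 0.309, 0.359, 0.376, 0.416, 0.437, 0.548]
DATE = list(range(1, 31))  # period index 1..30


def simple_ols(x, y):
    """Least-squares fit of y = a + b*x; returns (intercept, slope, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return intercept, slope, r_squared


intercept, slope, r_squared = simple_ols(DATE, IC)
print(f"IC = {intercept:.4f} + {slope:.4f} * Date, R-square = {r_squared:.4f}")
```

The printed coefficients and R-square should agree with the values shown in Figure 1 up to rounding. The small R-square is a reminder that Date alone explains only a small share of the variation in IC.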

By observation, the time plot displays a noticeable pattern: the samples bounce up and down over time. We are confident this reflects a seasonal factor, because ice cream consumption was higher during the summer and lower in the winter. Additionally, the regression line shows an increasing trend over the observation period, although the fit is weak (R-square = 0.0492). As a result, we believe the time-series factor Date may explain some of the change in mean ice cream consumption.

Since the time-series factor affects the model, we will look for outliers by Year. Hence, we draw 3 separate box plots: Income vs. Year, Price vs. Year, and Temp vs. Year.
Box-plot 1 (Income vs. Year):
[Box plot of Income (vertical axis, 75 to 100) by Year.]

Box-plot 2 (Price vs. Year):
[Box plot of Price (vertical axis, 0.26 to 0.30) by Year.]

Box-plot 3 (Temp vs. Year):
[Box plot of Temp (vertical axis, 20 to 80) by Year.]

According to the box plots above, there are no outliers in any of the 3 potential variables. Therefore, we move on to the backward selection method to screen out any needless variable.

The purpose of this study is to determine the main factors for ice cream consumption. Therefore, we run backward selection to eliminate the ineffective variables among Price, Income, and Temp in order to obtain a better model of ice cream consumption.

Backward Elimination: Step 0
All Variables Entered: R-Square = 0.7190 and C(p) = 4.0000

Analysis of Variance
Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
Model               3          0.09025       0.03008     22.17   <.0001
Error              26          0.03527       0.00136
Corrected Total    29          0.12552

Variable     Parameter Estimate   Standard Error   Type II SS   F Value   Pr > F
Intercept               0.19732          0.27022   0.00072338      0.53   0.4718
Price                  -1.04441          0.83436      0.00213      1.57   0.2218
Income                  0.00331          0.00117      0.01082      7.97   0.0090
Temp                    0.00346       0.00044555      0.08174     60.25   <.0001

Bounds on condition number: 1.1444, 9.9727


Backward Elimination: Step 1
Variable Price Removed: R-Square = 0.7021 and C(p) = 3.5669

Analysis of Variance
Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
Model               2          0.08812       0.04406     31.81   <.0001
Error              27          0.03740       0.00139
Corrected Total    29          0.12552

Variable     Parameter Estimate   Standard Error   Type II SS   F Value   Pr > F
Intercept              -0.11320          0.10828      0.00151      1.09   0.3051
Income                  0.00353          0.00117      0.01261      9.10   0.0055
Temp                    0.00354       0.00044496      0.08784     63.41   <.0001

Bounds on condition number: 1.1179, 4.4715


All variables left in the model are significant at the 0.0500 level.

Summary of Backward Elimination
Step   Variable Removed   Number Vars In   Partial R-Square   Model R-Square   C(p)     F Value   Pr > F
1      Price              2                0.0169             0.7021           3.5669   1.57      0.2218

As a result of the backward selection, the variable Price has been removed (F = 1.57, Pr > F = 0.2218, which is not significant at the 0.05 level). The R-square dropped only slightly, from 0.7190 to 0.7021, while the model F value increased from 22.17 to 31.81 and C(p) fell from 4.0000 to 3.5669. This indicates that the reduced model gives up very little explanatory power while using one fewer variable. Therefore, it yields a better model to predict ice cream consumption.
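The comparison between the full and the reduced model can be reproduced from the raw data. The following is a minimal Python sketch, not the SAS procedure itself: it fits both models by ordinary least squares through the normal equations and computes the partial F statistic for Price, which is the quantity backward elimination tests at each step. The helper ols_fit and all variable names are our own, and the data are copied from the data appendix.

```python
IC = [0.386, 0.374, 0.393, 0.425, 0.406, 0.344, 0.327, 0.288, 0.269, 0.256,
      0.286, 0.298, 0.329, 0.318, 0.381, 0.381, 0.470, 0.443, 0.386, 0.342,
      0.319, 0.307, 0.284, 0.326, 0.309, 0.359, 0.376, 0.416, 0.437, 0.548]
PRICE = [0.270, 0.282, 0.277, 0.280, 0.272, 0.262, 0.275, 0.267, 0.265, 0.277,
         0.282, 0.270, 0.272, 0.287, 0.277, 0.287, 0.280, 0.277, 0.277, 0.277,
         0.292, 0.287, 0.277, 0.285, 0.282, 0.265, 0.265, 0.265, 0.268, 0.260]
INCOME = [78, 79, 81, 80, 76, 78, 82, 79, 76, 79, 82, 85, 86, 83, 84,
          82, 80, 78, 84, 86, 85, 87, 94, 92, 95, 96, 94, 96, 91, 90]
TEMP = [41, 56, 63, 68, 69, 65, 61, 47, 32, 24, 28, 26, 32, 40, 55,
        63, 72, 72, 67, 60, 44, 40, 32, 27, 28, 33, 41, 52, 64, 71]


def ols_fit(columns, y):
    """Fit y on an intercept plus the given predictor columns.

    Returns (coefficients, sse), with the intercept first.
    """
    rows = [[1.0] + [float(col[i]) for col in columns] for i in range(len(y))]
    k = len(rows[0])
    # Augmented normal equations [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(a[r][c]))
        a[c], a[pivot] = a[pivot], a[c]
        for r in range(k):
            if r != c:
                factor = a[r][c] / a[c][c]
                a[r] = [u - factor * v for u, v in zip(a[r], a[c])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    sse = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
              for r, yi in zip(rows, y))
    return beta, sse


n = len(IC)
mean_ic = sum(IC) / n
sst = sum((v - mean_ic) ** 2 for v in IC)  # corrected total sum of squares

beta_full, sse_full = ols_fit([PRICE, INCOME, TEMP], IC)
beta_reduced, sse_reduced = ols_fit([INCOME, TEMP], IC)

r2_full = 1 - sse_full / sst
r2_reduced = 1 - sse_reduced / sst
# Partial F for Price: extra error from dropping Price over the full-model MSE.
partial_f_price = (sse_reduced - sse_full) / (sse_full / (n - 4))

print(f"R-square full = {r2_full:.4f}, reduced = {r2_reduced:.4f}")
print(f"Partial F for Price = {partial_f_price:.2f}")
```

The two R-square values and the partial F should match the SAS output above up to rounding. The sketch does not compute the p-value; the value 0.2218 reported by SAS is what places Price above the 0.05 threshold.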


Predicting Ice Cream Consumption by Temperature and Income
The REG Procedure
Dependent Variable: IC

Analysis of Variance
Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
Model               2          0.08812       0.04406     31.81   <.0001
Error              27          0.03740       0.00139
Corrected Total    29          0.12552

Root MSE           0.03722    R-Square   0.7021
Dependent Mean     0.35943    Adj R-Sq   0.6800
Coeff Var         10.35446

Parameter Estimates
Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept    1             -0.11320          0.10828     -1.05     0.3051
Temp         1              0.00354       0.00044496      7.96     <.0001
Income       1              0.00353          0.00117      3.02     0.0055
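The summary statistics in this output are tied together by standard formulas, so they can be checked against one another. The following minimal Python sketch recomputes R-square, adjusted R-square, Root MSE, the coefficient of variation, and the t values from the rounded figures printed above. The variable names are our own, and small differences in the last digit are expected because the inputs are already rounded.

```python
import math

n, p = 30, 2                      # observations, predictors (Temp, Income)
sse, sst = 0.03740, 0.12552       # error and corrected total sums of squares
dependent_mean = 0.35943

r_squared = 1 - sse / sst
adj_r_squared = 1 - (sse / (n - p - 1)) / (sst / (n - 1))
root_mse = math.sqrt(sse / (n - p - 1))
coeff_var = 100 * root_mse / dependent_mean

# t value = parameter estimate / standard error
estimates = {"Intercept": (-0.11320, 0.10828),
             "Temp": (0.00354, 0.00044496),
             "Income": (0.00353, 0.00117)}
t_values = {name: est / se for name, (est, se) in estimates.items()}

print(f"R-square = {r_squared:.4f}, Adj R-Sq = {adj_r_squared:.4f}")
print(f"Root MSE = {root_mse:.5f}, Coeff Var = {coeff_var:.3f}")
for name, t in t_values.items():
    print(f"{name}: t = {t:.2f}")
```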

According to the regression procedure above, we obtain the following regression model to estimate ice cream consumption:
IC = -0.11320 + 0.00354 * Temp + 0.00353 * Income
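To illustrate how the fitted equation is used, the following minimal Python sketch evaluates it for one observation. The function name predict_ic is our own; the coefficients are the rounded estimates from the equation above, and the example inputs are period 30 from the data appendix (Temp = 71, Income = 90, observed IC = 0.548).

```python
def predict_ic(temp_f, income):
    """Estimated ice cream consumption (pints per capita) from the fitted model.

    temp_f: mean temperature in degrees Fahrenheit
    income: weekly family income in dollars
    """
    return -0.11320 + 0.00354 * temp_f + 0.00353 * income


predicted = predict_ic(71, 90)
residual = 0.548 - predicted      # observed minus predicted for period 30
print(f"predicted IC = {predicted:.4f}, residual = {residual:.4f}")
```

For this period the model predicts about 0.456 pints per capita, so the observed value lies about 0.092 above the fitted value.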

The following figure is the residual plot for the regression model, with the residuals plotted against normal quantiles. We can observe a noticeable increasing trend.
Figure 2: Residual plot for IC by Temp and Income

[Plot of the residuals (vertical axis, -0.075 to 0.100) against Normal Quantile (horizontal axis).]
IC = -0.1132 + 0.0035 * Temp + 0.0035 * Income
N = 30, Rsq = 0.7021, AdjRsq = 0.6800, RMSE = 0.0372

In the regression model, we notice that the variable Temp depends on the change of season. To provide concrete evidence that the time-series factor Date contributes an increasing trend to ice cream consumption, we fit the regression model of IC vs. Temp separately for each Year. Afterward, we superimpose these 3 regression lines on the same plot, in the hope that the overlay reveals what lies behind the time-series factor.
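The per-year regressions can also be sketched outside SAS. The following minimal Python example, with names of our own choosing and data copied from the data appendix, fits IC on Temp separately for each Year and evaluates each fitted line at 50 degrees Fahrenheit. The reference temperature of 50 is an arbitrary choice made only to compare the lines at a common point.

```python
IC = [0.386, 0.374, 0.393, 0.425, 0.406, 0.344, 0.327, 0.288, 0.269, 0.256,
      0.286, 0.298, 0.329, 0.318, 0.381, 0.381, 0.470, 0.443, 0.386, 0.342,
      0.319, 0.307, 0.284, 0.326, 0.309, 0.359, 0.376, 0.416, 0.437, 0.548]
TEMP = [41, 56, 63, 68, 69, 65, 61, 47, 32, 24, 28, 26, 32, 40, 55,
        63, 72, 72, 67, 60, 44, 40, 32, 27, 28, 33, 41, 52, 64, 71]
YEAR = [1] * 10 + [2] * 13 + [3] * 7   # 1 = 1951, 2 = 1952, 3 = 1953


def simple_ols(x, y):
    """Least-squares fit of y = a + b*x; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


fits = {}        # year -> (intercept, slope)
at_50 = {}       # year -> fitted IC at Temp = 50
for year in sorted(set(YEAR)):
    x = [t for t, yr in zip(TEMP, YEAR) if yr == year]
    y = [c for c, yr in zip(IC, YEAR) if yr == year]
    fits[year] = simple_ols(x, y)
    at_50[year] = fits[year][0] + fits[year][1] * 50

for year, (a, b) in fits.items():
    print(f"Year {year}: IC = {a:.4f} + {b:.5f} * Temp, "
          f"fitted IC at 50 F = {at_50[year]:.3f}")
```

All three slopes come out positive, and the fitted consumption at the common temperature rises from Year 1 to Year 3, which is the pattern the overlay plot is meant to show.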

Figure 3: Overlay regression lines of IC vs. Temp by Year

[Overlay plot of IC (vertical axis, 0.25 to 0.55) against Temp (horizontal axis, 20 to 80). Plotting symbols 1, 2, and 3 mark the observations, and one fitted line is drawn, for Year 1951, Year 1952, and Year 1953 respectively.]

Conclusion: Figure 3 shows clearly that ice cream consumption increases when the weather is hot and decreases on cold days. Moreover, ice cream consumption increased year after year from March 18, 1951, at least for the 3 years of the study, as indicated by Figure 3. Although our sample is relatively small, we believe this is not a coincidence, because the ice cream industry is prospering in this day and age. For instance, Dreyer's Grand Ice Cream Company introduces new ice cream flavors every year and has had more than a billion dollars in annual revenue (Reference 2). As a result, history supports our test results.

Summary: The main factor that influences ice cream consumption is the temperature. We may expect ice cream sales to be higher in the summer and lower in the winter. Over the 3 years of the study, the demand for ice cream also increased every year.

Reference:
1. The Data and Story Library, Cornell University, NY
http://lib.stat.cmu.edu/DASL/Datafiles/IceCream.html
2. Dreyer's Grand Ice Cream Holdings, Inc.
http://www.dreyersinc.com/about/index.asp
Appendix 1 (Codebook):
1. Date: Time period (1-30) of the study (from 3/18/51 to 7/11/53)
2. IC: Ice cream consumption in pints per capita
3. Price: Price of ice cream per pint in dollars
4. Income: Weekly family income in dollars
5. Temp: Mean temperature in degrees Fahrenheit (°F)
6. Year: Year within the study (1 = 1951, 2 = 1952, 3 = 1953)

Appendix 2 (Data):
Obs   Date      IC   Price   Income   Temp   Year
  1      1   0.386   0.270       78     41      1
  2      2   0.374   0.282       79     56      1
  3      3   0.393   0.277       81     63      1
  4      4   0.425   0.280       80     68      1
  5      5   0.406   0.272       76     69      1
  6      6   0.344   0.262       78     65      1
  7      7   0.327   0.275       82     61      1
  8      8   0.288   0.267       79     47      1
  9      9   0.269   0.265       76     32      1
 10     10   0.256   0.277       79     24      1
 11     11   0.286   0.282       82     28      2
 12     12   0.298   0.270       85     26      2
 13     13   0.329   0.272       86     32      2
 14     14   0.318   0.287       83     40      2
 15     15   0.381   0.277       84     55      2
 16     16   0.381   0.287       82     63      2
 17     17   0.470   0.280       80     72      2
 18     18   0.443   0.277       78     72      2
 19     19   0.386   0.277       84     67      2
 20     20   0.342   0.277       86     60      2
 21     21   0.319   0.292       85     44      2
 22     22   0.307   0.287       87     40      2
 23     23   0.284   0.277       94     32      2
 24     24   0.326   0.285       92     27      3
 25     25   0.309   0.282       95     28      3
 26     26   0.359   0.265       96     33      3
 27     27   0.376   0.265       94     41      3
 28     28   0.416   0.265       96     52      3
 29     29   0.437   0.268       91     64      3
 30     30   0.548   0.260       90     71      3

Appendix 3 (SAS code):

data ice_cream;
  input Date IC Price Income Temp Year;
  datalines;
1 .386 .270 78 41 1
2 .374 .282 79 56 1
3 .393 .277 81 63 1
4 .425 .280 80 68 1
5 .406 .272 76 69 1
6 .344 .262 78 65 1
7 .327 .275 82 61 1
8 .288 .267 79 47 1
9 .269 .265 76 32 1
10 .256 .277 79 24 1
11 .286 .282 82 28 2
12 .298 .270 85 26 2
13 .329 .272 86 32 2
14 .318 .287 83 40 2
15 .381 .277 84 55 2
16 .381 .287 82 63 2
17 .470 .280 80 72 2
18 .443 .277 78 72 2
19 .386 .277 84 67 2
20 .342 .277 86 60 2
21 .319 .292 85 44 2
22 .307 .287 87 40 2
23 .284 .277 94 32 2
24 .326 .285 92 27 3
25 .309 .282 95 28 3
26 .359 .265 96 33 3
27 .376 .265 94 41 3
28 .416 .265 96 52 3
29 .437 .268 91 64 3
30 .548 .260 90 71 3
;
proc print data = ice_cream;
  title 'Data for Ice Cream Consumption';
run;

/* Box plots of each candidate predictor by Year (data are already ordered by Year) */
proc boxplot data = ice_cream;
  title 'Boxplot for Income vs. Year';
  plot Income*Year;
run;
proc boxplot data = ice_cream;
  title 'Boxplot for Price vs. Year';
  plot Price*Year;
run;
proc boxplot data = ice_cream;
  title 'Boxplot for Temp vs. Year';
  plot Temp*Year;
run;

/* Time plot with fitted line: IC vs. Date */
proc reg data = ice_cream;
  title 'Regression Model of IC vs. Date';
  model IC = Date;
  plot IC * Date;
run; quit;

/* Backward elimination at the 0.05 significance level */
proc reg data = ice_cream;
  model IC = Price Income Temp / selection = backward sls = .05 cp mse;
run; quit;

/* Final model, then a plot of its residuals against normal quantiles */
proc reg data = ice_cream;
  title 'Predicting Ice Cream Consumption by Temperature and Income';
  model IC = Temp Income;
run;
  title 'Residual plot for IC by Temp and Income';
  plot residual.*nqq.;
run; quit;

/* Separate regression of IC on Temp for each Year; keep the fitted values */
proc sort data = ice_cream; by year; run;
proc reg data = ice_cream; by year;
  title 'Predicting Ice Cream Consumption from Temperature by Year';
  model IC = Temp;
  output out = resids p = Fitted_IC;
run; quit;
proc print data = resids; run;

/* One set of IC, Temp, and fitted columns per Year for the overlay plot */
data resids;
  set resids;
  if year=1 then do; IC1=IC; Temp1=Temp; Fit1=Fitted_IC; end;
  if year=2 then do; IC2=IC; Temp2=Temp; Fit2=Fitted_IC; end;
  if year=3 then do; IC3=IC; Temp3=Temp; Fit3=Fitted_IC; end;
run;
proc sort data = resids; by year Temp; run;
proc print data = resids; run;
symbol1 cv=red   value='1'  i=none;
symbol2 cv=blue  value='2'  i=none;
symbol3 cv=black value='3'  i=none;
symbol4 cv=red   value=none i=join ci=red   line=1;
symbol5 cv=blue  value=none i=join ci=blue  line=2;
symbol6 cv=black value=none i=join ci=black line=3;

proc gplot data=resids;
  title 'Overlay regression lines of IC vs. Temp by Year';
  plot IC1*Temp1=1 IC2*Temp2=2 IC3*Temp3=3
       Fit1*Temp1=4 Fit2*Temp2=5 Fit3*Temp3=6 / overlay legend;
run; quit;
