Quality of Bordeaux Wines
By Gregory Ward Dittman
Table of Contents
UNL Student Code of Conduct
Introduction to Analysis
Linear Regression between Year and Price
Linear Regression between Price and Summer Temp
Linear Regression between Wine Price and Harvest Rainfall
Linear Regression between Wine Price and September Temperature
Linear Regression between Wine Price and Winter Rainfall
Linear Regression between Wine Price and Wine Age
Multiple Linear Regression w/ all coefficients
Final Conclusions
UNL Student Code of Conduct

Data set: winedata.xlsx (given by Professor)

I pledge my honor that I, Greg Dittman, have not violated the UNL
Student Code of Conduct during this assignment.

Signed by,
Greg Dittman.
Introduction to Analysis:
Brief Overview:
Many wines become better with age, and their prices increase to
reflect the higher value. Wines from some years mature into very good,
even excellent, wines, while others prove to be mediocre. There are
significant profits to be made from identifying high-quality vintages
early, when prices are low and before everyone realizes their quality.
For example, early buyers of the 1961 Bordeaux vintage made
substantial profits as 1961 matured into one of the finest vintages in
decades. Many people are willing to buy the young wines from the
1980s before their quality is known, hoping that they will mature
into classic wines like the 1961 vintage.
Null Hypothesis:
A number of experts predicted that the 1986 vintage would be
among the finest of the decade. The goal of this research paper is to
use the data to predict the relative prices for wines in the 1980s (in
particular, which vintages from 1981-1991 are likely to mature into
high-quality, high-priced wines) and to examine whether my analysis
provides different predictions from the experts.
Analysis Outline:
To analyze the hypothesis that the 1986 vintage will be among
the finest of the decade, I will use linear regression to regress the
dependent variable, price, on each of the independent variables given
(summer temp, harvest rainfall, September temp, winter rainfall, year,
and age).
Research Goal:
In conclusion, I will determine whether my analysis provides a
different prediction from the experts or fails to reject their hypothesis.
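
As a rough illustration of the single-variable regressions that follow, the sketch below fits a least-squares line and computes R-square from first principles. The function name `fit_line` and the four data points are hypothetical toy values chosen for illustration; they are not rows from winedata.xlsx.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (intercept, slope, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a one-predictor model, R-square is the squared correlation.
    r_squared = sxy ** 2 / (sxx * syy)
    return intercept, slope, r_squared


# Hypothetical toy values, NOT taken from the wine data set.
toy_temp = [15.0, 16.0, 17.0, 18.0]
toy_price = [10.0, 20.0, 25.0, 45.0]

intercept, slope, r_squared = fit_line(toy_temp, toy_price)
print(f"price = {intercept:.2f} + {slope:.2f} * temp, R-square = {r_squared:.3f}")
```

Running the same function on each independent variable in the real data set should, in principle, give the kind of equation and R-square reported in each section below.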
Linear Regression between Year and Price:
price = 2,302.48867 - 1.15602 * year

Regression Statistics
R 0.4549
R-square 0.20694
Adjusted R-square 0.17521
S 19.03182
N 27

Prices:
Minimum 10
Maximum 100
Mean 28.81481
[Figure: Scatter Diagram (Predicted Y, price vs. year). Axes: year (1950-1990) on x, price (0-100) on y.]

Interpretation:

The scatter diagram shows a negative linear relationship between the
dependent variable (price) and the independent variable (year). The
line of best fit is price = 2,302.489 - 1.15602 * year. The R-squared
value is only 0.20694, which indicates that the model explains only
about 21% of the variability in price around its mean. The fit is also
weakened by an outlier: the 1961 vintage, priced at $100.

This independent variable will not be used in the multiple linear
regression.
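
The three goodness-of-fit figures in the table above are linked: R is the square root of R-square, and adjusted R-square penalizes R-square for the number of predictors. The sketch below recomputes R and adjusted R-square from the reported R-square and N, using the standard adjustment formula. The helper name `adjusted_r_squared` is my own, and the results should agree with the table only up to rounding.

```python
import math


def adjusted_r_squared(r_squared, n, k):
    """Adjusted R-square for n observations and k predictors."""
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)


# Values reported in the regression statistics for price vs. year.
r_squared_year = 0.20694
n_obs = 27

r_year = math.sqrt(r_squared_year)
adj_year = adjusted_r_squared(r_squared_year, n_obs, k=1)

print(f"R = {r_year:.4f}, adjusted R-square = {adj_year:.5f}")
```

The same function applies to the multiple regression later in the report by setting `k` to the number of predictors kept in that model.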

Linear Regression between Price and Summer Temp:
Regression Statistics
R
R-square 0.34354
Adjusted R-square
S
N 27

price = - 281.41431 + 18.83559 * sum

Minimum
Maximum 100.
Mean 28.81481

[Figure: Scatter Diagram (Predicted Y, price vs. sum). Axes: sum (14-18) on x, price (0-100) on y.]
Interpretation:

The above scatter diagram shows that the independent variable (summer
temperature) and the dependent variable (price) have a positive linear
relationship, with the linear equation price = -281.41 + 18.84 *
summer temp. The R-squared value is only 0.34354, which can be partly
explained by the outlier in 1961, when the wine was priced at $100.

Linear Regression between Wine Price and Harvest Rainfall
Regression Statistics
R
R-square 0.19962
Adjusted R-square
S
N 27

price = 47.37198 - 0.12814 * har


[Figure: Scatter Diagram (Predicted Y, price vs. har). Axes: har (0-300) on x, price (0-100) on y.]

Interpretation:

The scatter diagram shows a negative linear relationship between the
independent variable (harvest rainfall) and the dependent variable
(price), with the linear equation price = 47.37 - 0.128 * harvest
rainfall. The R-squared value is only 0.19962, so the model explains
only about 20% of the variability in price around its mean. This can
be partly explained by the outlier of $100 for the 1961 vintage.

Linear Regression between Wine Price and September Temperature:
Regression Statistics
R 0.56884
R-square 0.32358
Adjusted R-square 0.29653
S 17.57657
N 27
[Figure: Scatter Diagram (Predicted Y, price vs. sep). Axes: sep (14-21) on x, price (0-100) on y.]

price = - 116.13707 + 8.5025 * sep

Interpretation:

The scatter diagram shows a positive linear relationship between the
independent variable (September temperature) and the dependent
variable (price), with the linear equation price = -116.14 + 8.5025 *
September temperature. The R-squared value is 0.32358, so the model
explains only about a third of the variability in price around its
mean. This can be partly explained by the outlier of $100 for the
1961 vintage.
Linear Regression between Wine Price and Winter Rainfall:
Regression Statistics
R 0.23121
R-square 0.05346
Adjusted R-square 0.01559
S 20.79203
N 27
[Figure: Scatter Diagram (Predicted Y, price vs. win). Axes: win (300-900) on x, price (0-100) on y.]

price = 5.96965 + 0.03755 * win

Interpretation:

The scatter diagram shows a slightly positive linear relationship
between the independent variable (winter rainfall) and the dependent
variable (price), with the linear equation price = 5.97 + 0.03755 *
winter rainfall. The R-squared value is only 0.05346, so the model
explains almost none of the variability in price around its mean. It
is safe to say that there is little to no correlation between the
amount of rainfall in the winter and the wine price.

This independent variable will not be used in the multiple linear
regression equation.
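
One way to back up the "little to no correlation" reading is a t test on the slope, which for a one-predictor model can be computed from R and N alone. This is a sketch under assumptions: the report itself does not run this test, the helper name `slope_t_statistic` is my own, and the critical value 2.060 is the usual two-sided 5% value for 25 degrees of freedom taken from standard t tables.

```python
import math


def slope_t_statistic(r, n):
    """t statistic for the slope of a simple regression, from r and n."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


r_winter = 0.23121   # R reported for price vs. winter rainfall
n_obs = 27           # N reported in the same table
T_CRITICAL = 2.060   # assumed: two-sided 5% critical value, 25 d.f.

t_winter = slope_t_statistic(r_winter, n_obs)
significant = abs(t_winter) > T_CRITICAL

print(f"t = {t_winter:.3f}, significant at 5%: {significant}")
```

A t statistic below the critical value would be consistent with dropping winter rainfall from the multiple regression.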
Linear Regression between Wine Price and Wine Age:
Regression Statistics
R 0.4549
R-square 0.20694
Adjusted R-square 0.17521
S 19.03182
N 27

price = - 0.29972 + 1.15602 * age

[Figure: Scatter Diagram (Predicted Y, price vs. age). Axes: age (10-50) on x, price (0-100) on y.]

Interpretation:

The scatter diagram shows a positive linear relationship between the
independent variable (wine age) and the dependent variable (price),
with the linear equation price = -0.29972 + 1.156 * age. The
R-squared value is 0.20694, so the model explains only about 21% of
the variability in price around its mean. This can be partly explained
by the outlier of $100 for the 1961 vintage.
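
The age regression reports the same R, R-square, and S as the year regression, with a slope of equal size and opposite sign. That is the pattern one would expect if age is simply a fixed reference year minus the vintage year. The sketch below checks this under that assumption. The reference year is inferred from the two reported equations and is not stated anywhere in the data description, so treat it as a guess.

```python
# Coefficients reported for the two simple regressions.
YEAR_INTERCEPT, YEAR_SLOPE = 2302.48867, -1.15602
AGE_INTERCEPT, AGE_SLOPE = -0.29972, 1.15602

# Assumption: age = reference_year - year. Equating the two fitted
# lines then gives the reference year implied by the coefficients.
reference_year = (YEAR_INTERCEPT - AGE_INTERCEPT) / AGE_SLOPE


def price_from_year(year):
    return YEAR_INTERCEPT + YEAR_SLOPE * year


def price_from_age(age):
    return AGE_INTERCEPT + AGE_SLOPE * age


year = 1961
age = round(reference_year) - year
gap = price_from_year(year) - price_from_age(age)

print(f"implied reference year: {reference_year:.2f}")
print(f"difference between the two fitted prices for {year}: {gap:.4f}")
```

If the two fitted prices differ only by rounding error, year and age carry the same information, which would explain why only one of them is kept for the multiple regression.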
Multiple Linear Regression w/ all coefficients:
Regression Statistics
R 0.82119
R-square 0.67435
Adjusted R-square 0.61514
S 13.00051
N 27

price = - 234.92138 + 0.96927 * age + 11.32997 * sum - 0.09384 *
har + 3.88931 * sep
[Figure: Residuals vs Predicted Y. Axes: Predicted Y (-10 to 70) on x, residuals (-30 to 40) on y.]

[Figure: Residuals by observation. Axes: observation (1-27) on x, residuals (-30 to 40) on y.]
Interpretation:

Using the four retained independent variables (year and winter
rainfall were dropped above), the multiple linear equation is

y = - 234.92138 + 0.96927(X1) + 11.32997(X2) - 0.09384(X3) +
3.88931(X4)

where:
y = price
X1 = wine age
X2 = summer temp.
X3 = harvest rainfall
X4 = September temperature

For this equation, the coefficient of multiple determination (the
R-squared value) is 0.67435. This value is high enough to conclude
that the model explains most (about 67%) of the variability of price
around its mean. By inputting data for a given vintage, you can be
reasonably confident in the predicted price, keeping in mind the
model's standard error of about 13.
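
To show how the fitted equation is applied, the sketch below wraps it in a small function. The coefficients are the ones reported above. The function name `predict_price` and the input values are hypothetical, chosen only to show the arithmetic. They are not rows from winedata.xlsx, and the units are assumed to match those of the data set.

```python
# Coefficients of the fitted multiple regression reported above.
INTERCEPT = -234.92138
B_AGE = 0.96927
B_SUM = 11.32997
B_HAR = -0.09384
B_SEP = 3.88931


def predict_price(age, summer_temp, harvest_rain, sept_temp):
    """Predicted price from the fitted multiple linear regression."""
    return (INTERCEPT
            + B_AGE * age
            + B_SUM * summer_temp
            + B_HAR * harvest_rain
            + B_SEP * sept_temp)


# Hypothetical inputs for illustration only.
example = predict_price(age=10, summer_temp=17.0,
                        harvest_rain=100.0, sept_temp=18.0)
# Same vintage with a summer one degree warmer.
warmer = predict_price(age=10, summer_temp=18.0,
                       harvest_rain=100.0, sept_temp=18.0)

print(f"predicted price: {example:.2f}")
print(f"effect of one extra degree of summer temp: {warmer - example:.2f}")
```

The second figure illustrates how each coefficient is read: holding the other variables fixed, a one-unit change in a predictor shifts the predicted price by that predictor's coefficient.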
Final Conclusions:
Estimated Multiple Linear Regression Equation
price = - 234.92138 + 0.96927 * age + 11.32997 * sum - 0.09384 *
har + 3.88931 * sep

Expected prices according to the above equation:

1980: $22.61
1981: $27.94
1982: $28.66
1983: $29.40
1984: $10.84
1985: $32.15
1986: $7.59 (experts' predicted finest of the decade)
1987: $25.25
1988: $22.50
1989: $42.60
1990: $46.44 (model's predicted finest of the decade)
1991: $28.76

Interpretation:

The predicted price of the 1986 vintage is only $7.59, the lowest
predicted price in the 1980-1991 range.

According to this model, the experts are incorrect in predicting that
the 1986 vintage would be among the finest of the decade. In fact, the
model predicts it to be the cheapest of the decade.

According to the multiple linear regression equation, the 1990 vintage
would be the finest of the decade.
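
The ranking behind these conclusions can be reproduced directly from the expected prices listed above. The sketch below is only a convenience for checking the comparison; the variable names are my own.

```python
# Expected prices from the table above, keyed by vintage year.
predicted = {
    1980: 22.61, 1981: 27.94, 1982: 28.66, 1983: 29.40,
    1984: 10.84, 1985: 32.15, 1986: 7.59, 1987: 25.25,
    1988: 22.50, 1989: 42.60, 1990: 46.44, 1991: 28.76,
}

# Vintages ordered from highest to lowest predicted price.
ranked = sorted(predicted, key=predicted.get, reverse=True)
cheapest = min(predicted, key=predicted.get)
priciest = max(predicted, key=predicted.get)
rank_1986 = ranked.index(1986) + 1

print(f"highest predicted price: {priciest} (${predicted[priciest]:.2f})")
print(f"lowest predicted price: {cheapest} (${predicted[cheapest]:.2f})")
print(f"1986 ranks {rank_1986} of {len(ranked)}")
```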