
Modelling Relationships with Regression

Debanjan Mitra

Indian Institute of Management Udaipur

November 3, 2020

Data: Advertising Expenditure

Data Source: James, Witten, Hastie, and Tibshirani (2017)


Data on:
Sales (in thousands of units) for a particular product
Advertising budget (in thousands of dollars) for TV, radio, and
newspaper media
Data from 200 markets


Table: Advertising Expenditure in Different Media and Sales
(TV, Radio, and Newspaper budgets in thousands of dollars; Sales in thousands of units)

Serial No.   TV      Radio   Newspaper   Sales
1            230.1   37.8    69.2        22.1
2            44.5    39.3    45.1        10.4
3            17.2    45.9    69.3        9.3
4            151.5   41.3    58.5        18.5
5            180.8   10.8    58.4        12.9
...          ...     ...     ...         ...

Questions of Interest

Is there a relationship between advertising budget in TV and sales?


How strong is the relationship between advertising budget in TV and
sales?
Which media contribute to sales?
Can we estimate the effect of each medium on sales?


Which media generate the biggest boost in sales?


How much increase in sales is associated with a given increase in TV
expenditure?
How can we predict future sales based on advertising budget in
different media?

Our Goal

To understand the relationship between sales and advertising
expenditure in different media
And in particular, to predict sales based on advertising expenditure
in all or some of the media
Note that here sales is the response, and advertising expenditures in
TV, radio, and newspaper media are the predictors

Exploration Begins With a Scatterplot
[Figure: Scatterplot matrix of TV, Radio, Newspaper, and Sales]

A More Detailed Look
[Figure: Pairs plot of TV, Radio, Newspaper, and Sales with pairwise correlations]

Pairwise correlations shown in the figure:

            Radio      Newspaper   Sales
TV          0.055      0.057       0.782***
Radio                  0.354***    0.576***
Newspaper                          0.228**

Our Approach to Exploration

We shall fit
Simple regression models with TV, radio, and newspaper advertising
expenditures as individual predictors
A multiple regression model with all three media expenditures as
predictors
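
As a rough illustration of what fitting a simple regression model involves, the sketch below computes the least-squares intercept and slope for sales on TV expenditure. It uses only the first five markets listed in the data table, not the full set of 200, so the estimates will not match the spreadsheet output discussed in these slides. The function name `fit_simple_regression` is made up for this example. The multiple regression model needs a linear algebra or statistics package and is not shown.

```python
from statistics import mean

# First five markets from the data table (illustration only, not all 200).
tv = [230.1, 44.5, 17.2, 151.5, 180.8]    # TV budget, thousands of dollars
sales = [22.1, 10.4, 9.3, 18.5, 12.9]     # sales, thousands of units


def fit_simple_regression(x, y):
    """Return the least-squares (intercept, slope) for y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


intercept, slope = fit_simple_regression(tv, sales)
print(f"Sales = {intercept:.3f} + {slope:.4f} * TV  (five markets only)")
```

The same function can be applied to the radio or newspaper column to obtain the other two simple models.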

Predictor: TV

Predictor: Radio

Predictor: Newspaper

Predictors: TV, Radio, and Newspaper

Significance of a Predictor

How do we know whether a predictor is indeed useful or not?

In the spreadsheet output, each predictor has a quantity called the p-value
The p-value indicates whether a predictor is significant for the
regression model or not
If the p-value is small, there is evidence that the predictor is useful
If the p-value is large, the data give no evidence that the predictor is useful
We compare the p-value with a cut-off value (usually 0.05): below the
cut-off it counts as small, otherwise as large
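
The decision rule above can be sketched in a few lines. The helper name `is_significant` is hypothetical, and the p-values are the rounded figures these slides report for the combined model, so a reported 0 stands for a very small positive number.

```python
ALPHA = 0.05  # the usual cut-off value


def is_significant(p_value, alpha=ALPHA):
    """Return True when the p-value falls below the cut-off."""
    return p_value < alpha


# Rounded p-values reported in the slides for the combined model.
combined_model_p_values = {"TV": 0.0, "Radio": 0.0, "Newspaper": 0.86}

for predictor, p_value in combined_model_p_values.items():
    verdict = "significant" if is_significant(p_value) else "not significant"
    print(f"{predictor}: p = {p_value} -> {verdict}")
```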

Example: Significance of a Predictor

In the combined model, the p-values corresponding to TV, Radio,
and Newspaper advertising expenditures are approximately 0, 0, and 0.86,
respectively
Useful or significant predictors: TV and Radio advertising expenditures
Insignificant predictor: Newspaper advertising expenditure!
A curious observation:
From the individual models, the p-values for TV, Radio, and
Newspaper advertising expenditures are approximately 0, 0, and 0.001, respectively
This implies that, individually, all predictors are significant!
Conflict? What is really happening?

Newspaper: Significant or Not?
[Figure: Pairs plot of TV, Radio, Newspaper, and Sales with pairwise correlations, repeated from "A More Detailed Look"]


The correlation between radio and newspaper is 0.35

Implication: a tendency to spend more on newspaper in markets where
more is spent on radio
Explanation:
Suppose newspaper advertising has no direct impact on sales
Radio advertising does increase sales
Then, in markets where we spend more on radio, our sales will tend
to be higher
We also tend to spend more on newspaper in those same markets
Thus, in a simple linear regression of sales on newspaper,
newspaper appears to be significant
Lurking variable: Radio!
This explanation is consistent with the results of the individual and full
regressions!
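
One way to check the lurking-variable explanation numerically is a first-order partial correlation: the correlation between newspaper and sales after removing the linear effect of radio. The sketch below uses the pairwise correlations reported in the figure. The function name is made up for this example, and only radio is controlled for (TV is ignored), so this is a rough check and not a substitute for the full regression.

```python
from math import sqrt

# Pairwise correlations reported in the slides.
r_news_sales = 0.228
r_radio_news = 0.354
r_radio_sales = 0.576


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y after removing the linear effect of z."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


adjusted = partial_correlation(r_news_sales, r_radio_news, r_radio_sales)
print(f"Newspaper-Sales correlation:           {r_news_sales:.3f}")
print(f"Same correlation, controlling for Radio: {adjusted:.3f}")
```

The adjusted value comes out at roughly 0.03, far below the raw 0.228, which fits the picture of newspaper borrowing its apparent effect from radio.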

Is The Regression Model Useful?

Is there a relationship between advertising budget and sales? How
strong is the relationship?
The coefficient of determination, R², is 89.7%
Implication: this model explains about 90% of the total variation in
sales
Useful, for sure! And quite a strong relationship between predictors
and response!
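
For comparison, the R² of each single-predictor model can be recovered from the correlations reported earlier, because in a simple linear regression with an intercept R² equals the squared correlation between predictor and response. The sketch below assumes the rounded correlations from the figure and the 89.7% quoted for the combined model. The variable names are illustrative.

```python
# Correlations with Sales, as reported in the pairs plot.
correlation_with_sales = {"TV": 0.782, "Radio": 0.576, "Newspaper": 0.228}
combined_r_squared = 0.897  # R² of the model with all three predictors

# In simple linear regression, R² is the squared correlation.
single_r_squared = {
    medium: r ** 2 for medium, r in correlation_with_sales.items()
}

for medium, r2 in single_r_squared.items():
    print(f"{medium} alone: R² is about {r2:.1%}")
print(f"All three media: R² = {combined_r_squared:.1%}")
```

No single medium on its own comes close to the share of variation explained by the combined model.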

Which media contribute to sales?

Which media contribute to sales?


The p-values of TV and Radio are close to zero (tiny)
Newspaper has a high p-value
TV and Radio are significant, and Newspaper is not

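Can we estimate the effect of each medium on sales?

How much increase in sales is associated with a given increase in
expenditures in these media?
The coefficient of TV is 0.046
The coefficient of Radio is 0.189
Radio has a larger effect per dollar: with the other media held fixed, an
extra $1,000 is associated with about 189 more units sold for Radio,
against about 46 for TV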

How can we predict sales using this model?

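Suppose advertising expenditures in TV, Radio, and Newspaper are
214.7, 24, and 4 thousand dollars, respectively
The predicted sales from the model would be

Predicted Sales = 2.939 + 0.046 ∗ 214.7 + 0.189 ∗ 24 − 0.001 ∗ 4
                = 17.2851 (in thousands of units)

Note: with the coefficients rounded as shown, the arithmetic gives 17.3472;
the value 17.2851 presumably comes from the unrounded coefficients in the
regression output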
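
The prediction step can be written as a small function. This is a minimal sketch that assumes the rounded coefficients shown on the slide, and the name `predict_sales` is made up for this example. Because of the rounding, it returns about 17.35 thousand units, which differs slightly from the figure quoted on the slide.

```python
# Rounded coefficients of the combined model, as shown on the slide.
COEFFICIENTS = {
    "Intercept": 2.939,
    "TV": 0.046,
    "Radio": 0.189,
    "Newspaper": -0.001,
}


def predict_sales(tv, radio, newspaper):
    """Predicted sales (thousands of units) for budgets in thousands of dollars."""
    return (
        COEFFICIENTS["Intercept"]
        + COEFFICIENTS["TV"] * tv
        + COEFFICIENTS["Radio"] * radio
        + COEFFICIENTS["Newspaper"] * newspaper
    )


predicted = predict_sales(tv=214.7, radio=24, newspaper=4)
print(f"Predicted sales: {predicted:.4f} thousand units")
```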

How close is the prediction?

The “Standard Error” in the regression output gives an idea of the
closeness of prediction
The actual sales in each market deviate from the predicted sales by
about 1,686 units, on average
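
For reference, and assuming the spreadsheet's “Standard Error” is the usual residual standard error of the regression, it would be computed as below. The symbols are introduced here for illustration: y_i is the actual sales in market i, ŷ_i the predicted sales, n = 200 the number of markets, and p = 3 the number of predictors.

```latex
\mathrm{RSE} = \sqrt{\frac{1}{n - p - 1} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}
```

Since sales are recorded in thousands of units, a value of 1.686 on this scale corresponds to the 1,686 units quoted above.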
