
Urooj Qaiser

13U00324

Case 1

10)

[Figure: Scatterplot of Clients vs Stamps. Y-axis: Clients (50 to 200); X-axis: Stamps (25000 to 35000).]

The plot above shows a moderate positive relationship between the two variables.

[Figure: Scatterplot of Clients vs Index. Y-axis: Clients (50 to 200); X-axis: Index (100 to 125).]

The plot above shows a moderate positive relationship between the two variables.

11)
Pearson correlation of Clients and Index = 0.604
P-Value = 0.000

The relationship between the number of clients and the index is statistically significant, since the p-value is less than 0.05. The Pearson correlation of 0.604 indicates a moderate positive relationship.

Pearson correlation of Clients and Stamps = 0.431
P-Value = 0.002

The relationship between the number of clients and stamps is statistically significant, since the p-value is less than 0.05. The Pearson correlation of 0.431 indicates a moderate positive relationship, weaker than the one between clients and the index.
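As a rough cross-check of the reported p-values, the t statistic for a Pearson correlation can be computed from r and the sample size. The sketch below assumes n = 48 observations, which is inferred from the 47 total degrees of freedom in the ANOVA tables rather than stated in the case. The function name is illustrative.

```python
import math

def correlation_t_stat(r: float, n: int) -> float:
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

N_OBS = 48  # assumption: 47 total DF in the ANOVA tables implies 48 observations

t_index = correlation_t_stat(0.604, N_OBS)
t_stamps = correlation_t_stat(0.431, N_OBS)
print(f"Clients vs Index:  t = {t_index:.2f}")
print(f"Clients vs Stamps: t = {t_stamps:.2f}")
```

Both statistics are well beyond the two-sided 5% critical value of roughly 2.01 for 46 degrees of freedom, which agrees with the small p-values above.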

12)

Regression (clients and index)

The normal probability plot indicates that the errors are normally distributed, since the blue points lie on the red line. The shape of the histogram supports this. The versus fits plot suggests a problem of heteroscedasticity, since the error points are clustered rather than randomly scattered. The versus order plot shows the errors spread evenly around the zero line, so they appear to be independently distributed. The Durbin-Watson statistic of 1.75062 is close to 2, which also indicates no problem of serial correlation.
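For reference, the Durbin-Watson statistic is the sum of squared differences between successive residuals divided by the sum of squared residuals. The case residuals are not reproduced here, so the sketch below uses a small hypothetical series only to show the calculation.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences over the sum of squared residuals."""
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Hypothetical residuals for illustration only (not the case data).
alternating = [1.0, -1.0, 1.0, -1.0]  # sign flips every period
constant = [2.0, 2.0, 2.0]            # residuals never change
print(durbin_watson(alternating))
print(durbin_watson(constant))
```

Values near 2 suggest no first-order serial correlation. Values toward 0 suggest positive correlation and values toward 4 suggest negative correlation.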

Analysis of Variance

Source      DF  Adj SS  Adj MS   F-Value  P-Value
Regression  1   11561   11560.7  26.43    0.000
Error       46  20123   437.4
Total       47  31683

H0: B1 = 0

H1: B1 ≠ 0 (the regression predictor is significant)

Since the p-value is less than 0.05, we reject H0 and conclude that the regression coefficient is significant. We can explore this further with the individual t-test of the regression coefficient.

Regression Equation

Clients = -105.3 + 2.131 Index

Term      Coef    SE Coef  T-Value  P-Value  VIF
Constant  -105.3  47.0     -2.24    0.030
Index     2.131   0.414    5.14     0.000    1.00

S        R-sq    R-sq(adj)
20.9152  36.49%  35.11%

The index is a significant predictor of the number of clients seen, since its p-value is less than 0.05. Multicollinearity is not a concern: the VIF is 1.00, as it must be when the model has a single predictor.

The index explains 36.49 percent of the variation in the number of new clients seen.
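The summary statistics reported for this model follow directly from the ANOVA table. The sketch below recomputes F, S, R-sq and R-sq(adj) from the sums of squares and degrees of freedom. The function name is illustrative, and small differences from the printed values come from rounding in the table.

```python
import math

def fit_summary(ss_regression, ss_error, df_regression, df_error):
    """Derive F, S, R-sq and adjusted R-sq from ANOVA sums of squares."""
    ss_total = ss_regression + ss_error
    df_total = df_regression + df_error
    ms_regression = ss_regression / df_regression
    ms_error = ss_error / df_error
    return {
        "F": ms_regression / ms_error,
        "S": math.sqrt(ms_error),
        "R-sq": ss_regression / ss_total,
        "R-sq(adj)": 1 - ms_error / (ss_total / df_total),
    }

# Values taken from the Clients-on-Index ANOVA table.
summary = fit_summary(ss_regression=11560.7, ss_error=20123.0,
                      df_regression=1, df_error=46)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```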

FORECAST

1st month 1993


Clients = -105.3 + 2.131 (125)= 161.075

2nd month 1993


Clients = -105.3 + 2.131 (125)= 161.075

3rd month 1993


Clients = -105.3 + 2.131 (130)= 171.73
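The three forecasts above can be reproduced by plugging the index values into the fitted equation. This is a minimal sketch. The index values of 125, 125 and 130 are the ones used in the calculations above, and the function and variable names are illustrative.

```python
INTERCEPT = -105.3  # constant from the regression equation
SLOPE = 2.131       # coefficient on Index

def forecast_clients(index_value):
    """Point forecast from Clients = -105.3 + 2.131 * Index."""
    return INTERCEPT + SLOPE * index_value

index_by_month = {
    "1st month 1993": 125,
    "2nd month 1993": 125,
    "3rd month 1993": 130,
}
forecasts = {month: forecast_clients(value) for month, value in index_by_month.items()}
for month, value in forecasts.items():
    print(f"{month}: {value:.3f}")
```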

Regression (clients and stamps)

The normal probability plot indicates that the errors are normally distributed, since the blue points lie on the red line. The shape of the histogram supports this. The versus fits plot shows no problem of heteroscedasticity, since the error points are randomly scattered and not clustered. The versus order plot shows the errors spread evenly around the zero line, so they appear to be independently distributed. The Durbin-Watson statistic of 1.84859 is close to 2, which also indicates no problem of serial correlation.

Analysis of Variance

Source      DF  Adj SS  Adj MS  F-Value  P-Value
Regression  1   5892    5891.9  10.51    0.002
Error       46  25791   560.7
Total       47  31683

H0: B1 = 0

H1: B1 ≠ 0 (the regression predictor is significant)

Since the p-value is less than 0.05, we reject H0 and conclude that the regression coefficient is significant. We can explore this further with the individual t-test of the regression coefficient.

Regression Equation

Clients = 32.7 + 0.00349 Stamps

Term      Coef     SE Coef  T-Value  P-Value  VIF
Constant  32.7     31.9     1.02     0.312
Stamps    0.00349  0.00108  3.24     0.002    1.00

S        R-sq    R-sq(adj)  R-sq(pred)
23.6787  18.60%  16.83%     11.28%

Stamps is a significant predictor of the number of clients seen, since its p-value is less than 0.05. Multicollinearity is not a concern: the VIF is 1.00, as it must be when the model has a single predictor.

Stamps explains 18.60 percent of the variation in the number of new clients seen.

14)

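Actual Values  Forecasted Values
152            161.075
151            161.075
199            171.73

The forecasts differ noticeably from the actual values: the first two months are over-predicted by about 9 and 10 clients, and the third month is under-predicted by about 27 clients.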
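One way to put a number on the gap between actual and forecasted values is to compute the forecast errors together with the mean absolute error (MAE) and root mean squared error (RMSE). These two summary measures are not part of the original answer. They are added here only as a sketch of how the comparison could be quantified.

```python
import math

actual = [152, 151, 199]                # actual clients, months 1-3 of 1993
forecast = [161.075, 161.075, 171.73]   # forecasts from the Index-only model

errors = [a - f for a, f in zip(actual, forecast)]
mae = sum(abs(e) for e in errors) / len(errors)
rmse = math.sqrt(sum(e ** 2 for e in errors) / len(errors))

for month, error in enumerate(errors, start=1):
    print(f"Month {month}: error = {error:+.3f}")
print(f"MAE  = {mae:.2f}")
print(f"RMSE = {rmse:.2f}")
```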

15) The business activity index is the better predictor of the number of clients, since it explains 36.49 percent of the variation in the number of new clients seen, whereas stamps explain only 18.60 percent.

Q16)

Analysis of Variance

Source          DF  SS       MS      F      P
Regression      4   17435.7  4358.9  13.16  0.000
Residual Error  43  14247.5  331.3
Total           47  31683.2

H0: B1 = B2 = B3 = B4 = 0

H1: At least one of the regression predictors is significant

Since the p-value is less than 0.05, we reject H0 and conclude that at least one of the regression coefficients is significant. We can explore this further with the individual t-tests of the regression coefficients.

Predictor     Coef       SE Coef   T      P      VIF
Constant      -291.03    60.16     -4.84  0.000
Stamps        -0.001590  0.002561  -0.62  0.538  9.587
Index         3.6976     0.9303    3.97   0.000  6.652
Bankruptcies  0.4693     0.1605    2.92   0.005  1.486
Permits       -0.08475   0.04517   -1.88  0.067  2.090

S = 18.2027   R-Sq = 55.0%   R-Sq(adj) = 50.8%

Since Stamps (p = 0.538) and Permits (p = 0.067) are not significant at the 0.05 level, we drop these variables and rerun the model.
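The screening step described above amounts to comparing each predictor's p-value with the 0.05 significance level. The sketch below applies that rule to the p-values from the coefficient table. The variable names are illustrative.

```python
ALPHA = 0.05  # significance level used throughout the analysis

# P-values from the four-predictor coefficient table.
p_values = {
    "Stamps": 0.538,
    "Index": 0.000,
    "Bankruptcies": 0.005,
    "Permits": 0.067,
}

kept = [name for name, p in p_values.items() if p < ALPHA]
dropped = [name for name, p in p_values.items() if p >= ALPHA]
print("Keep:", kept)
print("Drop:", dropped)
```

This mirrors the approach taken here of dropping all insignificant predictors at once. A more cautious variant would remove one predictor at a time and refit, since p-values can change once a correlated variable leaves the model.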

[Figure: Residual Plots for Clients. Panels: Normal Probability Plot (Percent vs Residual), Versus Fits (Residual vs Fitted Value), Histogram (Frequency vs Residual), Versus Order (Residual vs Observation Order).]

The normal probability plot indicates that the errors are normally distributed, since the red points lie on the blue line. The shape of the histogram supports this. The versus fits plot shows no problem of heteroscedasticity, since the error points are randomly scattered and not clustered. The versus order plot shows the errors spread evenly around the zero line, so they appear to be independently distributed. The Durbin-Watson statistic is also close to 2, indicating no problem of serial correlation.

Analysis of Variance

Source          DF  SS       MS      F      P
Regression      2   15123.7  7561.8  20.55  0.000
Residual Error  45  16559.6  368.0
Total           47  31683.2

H0: B1 = B2 = 0

H1: At least one of the regression predictors is significant

Since the p-value is less than 0.05, we reject H0 and conclude that at least one of the regression coefficients is significant. We can explore this further with the individual t-tests of the regression coefficients.

The regression equation of the final model is


Clients = - 217 + 2.49 Index + 0.452 Bankruptcies

Predictor     Coef     SE Coef  T      P      VIF
Constant      -216.74  56.02    -3.87  0.000
Index         2.4931   0.3976   6.27   0.000  1.094
Bankruptcies  0.4517   0.1451   3.11   0.003  1.094

S = 19.1831   R-Sq = 47.7%   R-Sq(adj) = 45.4%

Both the index and bankruptcies are significant predictors of the number of clients seen, since both p-values are less than 0.05. There is also no problem of multicollinearity, since the VIF for both variables is less than 10. However, the BAI is the stronger predictor of the number of clients seen compared with bankruptcies, as its t-value is larger (6.27 versus 3.11).

The two significant predictors explain 47.7 percent of the variation in the number of new clients seen (45.4 percent on an adjusted basis).
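To see what the VIF values say about multicollinearity, recall that VIF = 1 / (1 - R²), where R² comes from regressing one predictor on the other predictors. The sketch below inverts that formula for the VIFs reported in the two coefficient tables of Q16. The function name is illustrative.

```python
def r_squared_from_vif(vif):
    """Share of a predictor's variance explained by the other predictors."""
    return 1 - 1 / vif

# VIFs reported for the final two-predictor model.
final_model_vif = {"Index": 1.094, "Bankruptcies": 1.094}
# VIFs reported for the earlier four-predictor model.
four_predictor_vif = {
    "Stamps": 9.587,
    "Index": 6.652,
    "Bankruptcies": 1.486,
    "Permits": 2.090,
}

for label, table in [("Final model", final_model_vif),
                     ("Four-predictor model", four_predictor_vif)]:
    print(label)
    for name, vif in table.items():
        print(f"  {name}: VIF = {vif}, implied R-sq = {r_squared_from_vif(vif):.3f}")
```

A VIF of 1.094 means under 9 percent of each predictor's variation is shared with the other. In the four-predictor model, the VIF of 9.587 for Stamps corresponds to about 90 percent, which is close to the cutoff of 10 used in this analysis.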

Q17)

Analysis of Variance

Source          DF  SS       MS      F      P
Regression      12  61203.6  5100.3  16.10  0.000
Residual Error  86  27249.2  316.9
Total           98  88452.7

H0: B1 = B2 = B3 = B4 = B5 = B6 = B7 = B8 = B9 = B10 = B11 = B12 = 0

H1: At least one of the regression predictors is significant

Since the p-value is less than 0.05, we reject H0 and conclude that at least one of the regression coefficients is significant. We can explore this further with the individual t-tests of the regression coefficients.

The regression equation is


Clients = - 187 + 2.68 Index + 34.9 S1 + 27.2 S2 + 34.9 S3 + 13.5 S4 + 8.78 S5
+ 14.5 S6 + 11.1 S7 + 10.8 S8 + 6.37 S9 + 22.7 S10 + 12.0 S11

Predictor  Coef     SE Coef  T      P      VIF
Constant   -186.75  26.43    -7.07  0.000
Index      2.6763   0.2391   11.19  0.000  1.043
S1         34.863   8.683    4.02   0.000  1.947
S2         27.193   8.690    3.13   0.002  1.950
S3         34.915   8.745    3.99   0.000  1.975
S4         13.526   8.909    1.52   0.133  1.842
S5         8.783    8.903    0.99   0.327  1.839
S6         14.489   8.904    1.63   0.107  1.840
S7         11.121   8.901    1.25   0.215  1.839
S8         10.783   8.903    1.21   0.229  1.839
S9         6.371    8.901    0.72   0.476  1.839
S10        22.695   8.906    2.55   0.013  1.841
S11        12.030   8.905    1.35   0.180  1.840

S = 17.8003   R-Sq = 69.2%   R-Sq(adj) = 64.9%

The variables S4, S5, S6, S7, S8, S9 and S11 are insignificant predictors of the dependent variable (all p-values exceed 0.05), so we drop these variables and rerun the regression.

[Figure: Residual Plots for Clients. Panels: Normal Probability Plot (Percent vs Residual), Versus Fits (Residual vs Fitted Value), Histogram (Frequency vs Residual), Versus Order (Residual vs Observation Order).]

The normal probability plot indicates that the errors are normally distributed, since the red points lie on the blue line. The shape of the histogram supports this. The versus fits plot shows no problem of heteroscedasticity, since the error points are randomly scattered and not clustered. The versus order plot shows the errors spread evenly around the zero line, so they appear to be independently distributed. The Durbin-Watson statistic is also close to 2, indicating no problem of serial correlation.

Analysis of Variance

Source          DF  SS     MS     F      P
Regression      5   59989  11998  39.20  0.000
Residual Error  93  28464  306
Total           98  88453

H0: B1 = B2 = B3 = B4 = B5 = 0

H1: At least one of the regression predictors is significant

Since the p-value is less than 0.05, we reject H0 and conclude that at least one of the regression coefficients is significant. We can explore this further with the individual t-tests of the regression coefficients.

The final regression equation is
Clients = - 179 + 2.70 Index + 25.2 S1 + 17.5 S2 + 25.2 S3 + 13.0 S10

Predictor  Coef     SE Coef  T      P      VIF
Constant   -179.34  25.48    -7.04  0.000
Index      2.6970   0.2346   11.49  0.000  1.040
S1         25.176   6.253    4.03   0.000  1.045
S2         17.499   6.260    2.80   0.006  1.048
S3         25.183   6.321    3.98   0.000  1.068
S10        13.045   6.562    1.99   0.050  1.035

S = 17.4948   R-Sq = 67.8%   R-Sq(adj) = 66.1%

Index, S1, S2, S3 and S10 are significant predictors of the number of clients seen, since their p-values are at or below 0.05 (S10 is borderline at 0.050). There is also no problem of multicollinearity, since the VIF for all variables is less than 10. Index, S1 and S3 have the largest t-values and are therefore the strongest predictors in the model.

The significant predictors explain 67.8 percent of the variation in the number of new clients seen (66.1 percent on an adjusted basis). This indicates that seasonal variation does exist and helps explain the variation in the dependent variable.

18)

Clients = - 179 + 2.70 Index + 25.2 S1 + 17.5 S2 + 25.2 S3 + 13.0 S10

Forecast 1st month 1993

Clients = - 179 + 2.70(125) + 25.2 (1) + 17.5(0) + 25.2(0) + 13.0 (0)


Clients=183.7

Forecast 2nd month 1993

Clients = - 179 + 2.70(125) + 25.2 (0) + 17.5(1) + 25.2(0) + 13.0 (0)


Clients=176

Forecast 3rd month 1993

Clients = - 179 + 2.70(130) + 25.2 (0) + 17.5(0) + 25.2(1) + 13.0 (0)


Clients= 197.2

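The forecasts for the first two months (183.7 and 176) are well above the actual values (152 and 151), although the third forecast (197.2) is close to the actual value of 199. Overall, the model is not very accurate.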
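The seasonal forecasts can be reproduced by switching on the dummy variable for the month being forecast and leaving the others at zero. The sketch below assumes that S1, S2, S3 and S10 are indicators for months 1, 2, 3 and 10, which is how they are used in the calculations above. The function name is illustrative.

```python
# Rounded coefficients from the final seasonal regression equation.
COEFFICIENTS = {
    "Constant": -179.0,
    "Index": 2.70,
    "S1": 25.2,
    "S2": 17.5,
    "S3": 25.2,
    "S10": 13.0,
}

def forecast_clients_seasonal(index_value, month):
    """Forecast for a month (1-12); months without a retained dummy add nothing."""
    base = COEFFICIENTS["Constant"] + COEFFICIENTS["Index"] * index_value
    seasonal = COEFFICIENTS.get(f"S{month}", 0.0)  # assumption: S<k> flags month k
    return base + seasonal

# Index values of 125, 125 and 130 are those used in the forecasts above.
for month, index_value in [(1, 125), (2, 125), (3, 130)]:
    print(f"Month {month}: {forecast_clients_seasonal(index_value, month):.1f}")
```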
