
North South University

Course : BUS 173


Sec : 06

Assigned By:
PhD, Professor
Dept. of BBA, North South University

Applied Statistics Project

Prepared By:
Letter of Transmittal

24th December, 2018.

M.A. Matin

PhD, Professor

Department of Management

School of Business and Economics

North South University

Dear Sir,

We are pleased to submit our report on the data analysis project for the Applied Statistics course.

We have tried our best to meet your expectations, and we hope that you will kindly overlook any
mistakes the report may contain.

We will be very grateful if you kindly accept this report.

Sincerely,

Salina Akter
Rifat Mahmud
Sadia Afrin Mim
Md. Imran Khan
Introduction

A simple regression model includes one independent variable and one dependent variable. Usually,
however, a dependent variable is affected by more than one independent variable. A regression
model that includes two or more independent variables is called a multiple regression model.

Our project is about the monthly auto insurance, driving experience, number of driving violations,
and age of drivers in a company. Here driving experience, number of driving violations and age are
the independent variables, and monthly auto insurance is the dependent variable.

As there is more than one independent variable in this project, we will use a multiple regression
model for the project.

Multiple Regression

The research conducted in this report uses multiple regression, where the independent variables
are driving experience, number of driving violations and age. The test aims to find out whether
these variables have a significant effect on the dependent variable, monthly auto insurance.

The hypothesis of this research is:

H0 : Independent Variables are insignificant

H1: Independent Variables are significant


The variables of the hypothesis are:

Dependent Variable

 Monthly Auto Insurance

Independent Variables

 Driving Experience
 Number of Driving violations
 Age

The estimated regression model is:

Y = B0 + B1X1 + B2X2 + B3X3

Where,

Y = Monthly auto insurance
B0 = Constant
B1, B2, B3 = Regression coefficients
X1 = Driving experience
X2 = Number of driving violations
X3 = Age

So the estimated regression is finally:

Y (Monthly auto insurance) = B0 + B1X1 (Driving experience) + B2X2 (Number of driving violations) + B3X3 (Age)

According to Minitab, the estimated regression equation of monthly auto insurance on driving
experience, num. of driving violations and age is:

Monthly auto insurance = 52.6 - 8.50 Driving experience + 12.41 Num. of driving violations + 3.80 Age
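
As a rough illustration of how the estimated equation can be used, the sketch below plugs values into the Minitab coefficients reported in this project. The function name `predict_premium` and the example driver are hypothetical and not part of the data set. The code also assumes that experience and age are measured in years, which the report does not state.

```python
# Coefficients copied from the Minitab regression equation in this report.
INTERCEPT = 52.6
B_EXPERIENCE = -8.50   # change per unit of driving experience (assumed: years)
B_VIOLATIONS = 12.41   # change per additional driving violation
B_AGE = 3.80           # change per unit of age (assumed: years)


def predict_premium(experience, violations, age):
    """Return the fitted monthly auto insurance for one driver."""
    return (INTERCEPT
            + B_EXPERIENCE * experience
            + B_VIOLATIONS * violations
            + B_AGE * age)


# Hypothetical driver (not from the data set): 10 years of experience,
# 1 driving violation, 35 years old.
example = predict_premium(10, 1, 35)
print(f"Predicted monthly auto insurance: {example:.2f}")
```

The signs of the coefficients show the direction of each effect: more experience lowers the fitted value, while more violations and higher age raise it.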

The summary from the Minitab output is:

Model Summary

Model   Se        R Square   Adjusted R Square   Predicted R Square
1       14.4263   87.35%     84.64%              77.39%

The values of the standard deviation of errors and the coefficient of multiple determination
given in the Minitab output are:

Se = 14.4263 and R2 = 87.35%

Interpretations of these values:

Interpretation of Se

Standard Deviation of Errors, Se = 14.4263, tells us that the observed monthly auto insurance
values differ from the values predicted by the regression model by about 14.4263 on average.

Interpretation of R2

R2 = 87.35% means that 87.35% of the total variation in monthly auto insurance is explained by
the regression model, that is, by driving experience, number of driving violations and age.
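
The values of Se, R2 and adjusted R2 can be checked by hand from the sums of squares reported in the ANOVA table of this report. The following is a minimal sketch of that check. The variable names are our own, and n = 18 and k = 3 are taken from the degrees of freedom in the output.

```python
import math

# Sums of squares as printed (rounded) in the Minitab ANOVA table.
SSR = 20124.0   # regression sum of squares
SSE = 2914.0    # error sum of squares
SST = 23038.0   # total sum of squares
n = 18          # number of observations (total df 17 + 1)
k = 3           # number of independent variables

# Coefficient of multiple determination.
r_squared = SSR / SST

# Adjusted R2 penalises the number of independent variables.
adj_r_squared = 1 - (SSE / (n - k - 1)) / (SST / (n - 1))

# Standard deviation of errors: square root of the mean square error.
s_e = math.sqrt(SSE / (n - k - 1))

print(f"R2          = {r_squared:.2%}")
print(f"Adjusted R2 = {adj_r_squared:.2%}")
print(f"Se          = {s_e:.4f}")
```

Because the sums of squares are rounded in the printed table, Se computed this way can differ from the Minitab value of 14.4263 in the third decimal place.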

ANOVA

Model        df   Sum of Squares   Mean Square   F       R2
Regression    3   20124            6708.0        32.23   87.35%
Error        14    2914             208.1
Total        17   23038

Findings:

The ANOVA table gives the result of the F-test. The F-test measures the joint
significance of the independent variables, that is, whether the independent
variables taken together influence the dependent variable or not.

In other words, the F-test compares the variation explained by the regression
(Mean Square Regression) with the unexplained variation (Mean Square Error)
to determine whether the explained variation is significantly larger.

Our SSR (Regression Sum of Squares) is 20124 while SSE (Error Sum of
Squares) is 2914, giving an SST (Total Sum of Squares) of 23038. The F-ratio
calculated by Minitab is 32.23, which is the Mean Square Regression (6708.0)
divided by the Mean Square Error (208.1).

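If the calculated F-statistic is greater than the critical F-statistic, then the test is
significant.

Degrees of freedom are the number of observations minus the number of
constraints or assumptions needed to calculate a statistical item. The critical
F-statistic for numerator degrees of freedom 3 and denominator degrees of
freedom 14 is 4.24 (found from the F-table at the 2.5% significance level; the
5% value is 3.34). Since the calculated F-statistic of 32.23 is greater than the
critical value, we reject H0 and conclude that the independent variables taken
together are significant.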
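
The F-test decision can be reproduced from the ANOVA figures with a few lines of arithmetic. The sketch below is only an illustration with variable names of our own. It takes the critical value 4.24 quoted in the text as given rather than computing it.

```python
# Figures from the ANOVA table (rounded as printed by Minitab).
SSR = 20124.0        # regression sum of squares
SSE = 2914.0         # error sum of squares
df_regression = 3    # numerator degrees of freedom (k)
df_error = 14        # denominator degrees of freedom (n - k - 1)

# Mean squares are sums of squares divided by their degrees of freedom.
msr = SSR / df_regression
mse = SSE / df_error

# The F-statistic is the ratio of explained to unexplained mean squares.
f_stat = msr / mse

# Critical value quoted in the text (taken from the F-table, not computed here).
F_CRITICAL = 4.24

reject_h0 = f_stat > F_CRITICAL
print(f"F = {f_stat:.2f}, critical F = {F_CRITICAL}, reject H0: {reject_h0}")
```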

95% confidence intervals for the regression coefficients are:

From output of Minitab

b1 = -8.50 and Sb1 = 2.04

b2 = 12.41 and Sb2 = 2.78

b3 = 3.80 and Sb3 = 1.17

Level of confidence = 95%

So the value of Alpha = 5%

Area in each tail of the t distribution = (1-0.95)/2 = .025

Degrees of freedom = (n-k-1) = 18-3-1 = 14

From the t distribution table, the value of t for .025 area in each tail and 14 degrees of freedom is 2.145

95% Confidence Interval for B1: (b1 ± t × Sb1) = -8.50 ± 2.145 × 2.04

= -8.50 ± 4.3758

= -12.8758 to -4.1242

We are 95% confident that for a one-unit increase in driving experience, with the other variables
held constant, the monthly auto insurance changes by an amount between $-12.8758 and $-4.1242.

95% Confidence Interval for B2: (b2 ± t × Sb2) = 12.41 ± 2.145 × 2.78

= 12.41 ± 5.9631

= 6.4469 to 18.3731

We are 95% confident that for one additional driving violation, with the other variables held
constant, the monthly auto insurance changes by an amount between $6.4469 and $18.3731.

95% Confidence Interval for B3: (b3 ± t × Sb3) = 3.80 ± 2.145 × 1.17

= 3.80 ± 2.50965

= 1.29035 to 6.30965

We are 95% confident that for a one-unit increase in age, with the other variables held constant,
the monthly auto insurance changes by an amount between $1.29035 and $6.30965.
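
The interval calculation b ± t × Sb can be written as a small helper, shown here as a sketch rather than as Minitab's own procedure. The coefficients and standard errors are those reported in the Minitab output. The critical value 2.145 is the standard t-table entry for an area of .025 in each tail with 14 degrees of freedom. The helper name `confidence_interval` is our own.

```python
# Two-tailed 95% critical value of t for 14 degrees of freedom (t-table).
T_CRIT = 2.145

# (coefficient, standard error) pairs from the Minitab Coefficients table.
coefficients = {
    "Driving experience": (-8.50, 2.04),
    "Num. of driving violations": (12.41, 2.78),
    "Age": (3.80, 1.17),
}


def confidence_interval(coef, std_err, t_crit=T_CRIT):
    """Return (lower, upper) limits of coef ± t_crit × std_err."""
    margin = t_crit * std_err
    return coef - margin, coef + margin


intervals = {
    name: confidence_interval(coef, std_err)
    for name, (coef, std_err) in coefficients.items()
}

for name, (lower, upper) in intervals.items():
    print(f"{name}: {lower:.4f} to {upper:.4f}")
```

None of the three intervals contains zero, which agrees with each coefficient being significant at the 5% level.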

Decision and Conclusion:

The estimated regression equation of the monthly auto insurance of the drivers is

ŷ = 52.6 - 8.50 X1 + 12.41 X2 + 3.80 X3

This equation is obtained from Minitab.

The important factors for determining the monthly auto insurance of the drivers are:

1. Driving experience

2. Num. of violation

3. Age

By using this equation, we can estimate the monthly auto insurance of a driver from the driver's
driving experience, number of driving violations and age.

We know, the p-value for each term tests the null hypothesis that the coefficient is equal to zero
(no effect). A low p-value (< 0.05) indicates that you can reject the null hypothesis.

The p-value for the driving experience is 0.001. So, we reject the null hypothesis.

The p-value for the num. of violations is 0.001. So, we reject the null hypothesis.

The p-value for the age is 0.006. So, we reject the null hypothesis.

The p-value for the Regression is 0.000. So, we reject the null hypothesis that the independent
variables taken together have no effect.

We can conclude that driving experience, number of driving violations and age, individually and
taken together, have a significant effect on the monthly auto insurance.
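
The p-value decision rule stated above (reject the null hypothesis when the p-value is below 0.05) can be applied mechanically to the p-values printed in the Minitab output. The sketch below does only that. The dictionary and variable names are our own.

```python
ALPHA = 0.05  # significance level used in this report

# P-values as printed in the Minitab Analysis of Variance table.
p_values = {
    "Regression (overall F-test)": 0.000,
    "Driving experience": 0.001,
    "Num. of driving violations": 0.001,
    "Age": 0.006,
}

# A p-value below ALPHA means the null hypothesis (no effect) is rejected.
decisions = {
    term: "reject H0" if p < ALPHA else "do not reject H0"
    for term, p in p_values.items()
}

for term, decision in decisions.items():
    print(f"{term}: p = {p_values[term]:.3f} -> {decision}")
```

Every p-value here is below 0.05, so the rule gives the same decision, reject H0, for the overall regression and for each of the three independent variables.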

Regression Analysis: Monthly auto ins versus Driving experience, Num. of driving, Age

Analysis of Variance

Source                        DF   Adj SS   Adj MS   F-Value   P-Value
Regression                     3    20124   6708.0     32.23     0.000
  Driving experience           1     3613   3612.6     17.36     0.001
  Num. of driving violations   1     4142   4142.3     19.90     0.001
  Age                          1     2212   2211.8     10.63     0.006
Error                         14     2914    208.1
Total                         17    23038

Model Summary

S         R-sq     R-sq(adj)   R-sq(pred)
14.4263   87.35%   84.64%      77.39%

Coefficients

Term                          Coef   SE Coef   T-Value   P-Value    VIF
Constant                      52.6      24.6      2.14     0.050
Driving experience           -8.50      2.04     -4.17     0.001   6.89
Num. of driving violations   12.41      2.78      4.46     0.001   1.85
Age                           3.80      1.17      3.26     0.006   5.71

Regression Equation

Monthly auto insurance = 52.6 - 8.50 Driving experience + 12.41 Num. of driving violations + 3.80 Age

Fits and Diagnostics for Unusual Observations

Obs   Monthly auto insurance   Fit      Resid    Std Resid
14    120.00                   145.43   -25.43   -2.04       R

R  Large residual
