Group Assignment Statistics

Statistics 1
STATISTICS FOR BUSINESS DECISION (HI 6007)

GROUP ASSIGNMENT
GROUP DETAILS
MAULIK THAKAR EGU8594
KIRANDEEP KAUR EMV8687
SHOUGANG CHEN DC5003
Table of Contents
Task 1
Task 2
References
Task 1:
1. Descriptive statistics:
Descriptive statistics summarise the important variables in a data set (Holcomb, 2016). They can
be calculated with software such as MS Excel or SPSS, which gives accurate results with little
effort. Dependent and independent variables can both be summarised in this way, and the
summaries provide a basis for examining the relationships between the variables in the data set.
The measures calculated here include the mean, median, mode, range, standard deviation and
variance. The results obtained from the software are as follows:
For X1= Pizza:
X1= startup costs for pizza
Mean 83
Median 80
Mode 74
Standard Deviation 34.13
Sample Variance 1165.16
Range 105
Count 13
For X2= Baker/Donuts:
X2= startup costs for baker/donuts
Mean 92.09090909
Standard Error 11.72677941
Median 87
Mode 76.82
Standard Deviation 38.89332731
Sample Variance 1512.690909
Kurtosis -0.436922711
Skewness 0.509844144
Range 120
Minimum 40
Maximum 160
Sum 1013
Count 11
For X3= Shoe stores:
X3= startup costs for shoe stores
Mean 72.3
Standard Error 9.918613254
Median 70
Mode 65.4
Standard Deviation 31.36540911
Sample Variance 983.7888889
Kurtosis -0.958969069
Range 90
Minimum 35
Maximum 125
Sum 723
Count 10
For X4= Gift shops:
X4= startup costs for gift shops
Mean 87
Standard Error 11.3539029
Median 97.5
Mode 118.5
Standard Deviation 35.9041935
Sample Variance 1289.111111
Kurtosis -0.485709919
Range 115
Minimum 35
Maximum 150
Sum 870
Count 10
For X5= Pet stores:
X5= startup costs for pet stores
Mean 51.625
Standard Error 6.76872403
Median 49
Mode 43.75
Standard Deviation 27.07489612
Sample Variance 733.05
Kurtosis -0.47673397
Range 90
Minimum 20
Maximum 110
Sum 826
Count 16
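The reported modes are not whole numbers, and they appear to coincide with Pearson's empirical approximation, mode = 3 x median - 2 x mean. The sketch below checks that relation against the means, medians and modes quoted in this report, and also checks the baker/donuts standard error as s / sqrt(n). This is an observation about the numbers only; the report does not say how the software produced the mode, and the function names are illustrative.

```python
import math

# (mean, median, reported mode) copied from this report for each business.
SUMMARY = {
    "pizza": (83.0, 80.0, 74.0),
    "baker/donuts": (92.09090909, 87.0, 76.82),
    "shoe stores": (72.3, 70.0, 65.4),
    "gift shops": (87.0, 97.5, 118.5),
    "pet stores": (51.625, 49.0, 43.75),
}


def empirical_mode(mean, median):
    """Pearson's empirical approximation: mode ~ 3 * median - 2 * mean."""
    return 3 * median - 2 * mean


def standard_error(std_dev, n):
    """Standard error of the mean: s / sqrt(n)."""
    return std_dev / math.sqrt(n)


if __name__ == "__main__":
    for name, (mean, median, reported) in SUMMARY.items():
        estimate = empirical_mode(mean, median)
        print(f"{name}: empirical mode {estimate:.2f}, reported mode {reported}")
    # Baker/donuts: s = 38.89332731 with n = 11 observations.
    print(f"baker/donuts standard error: {standard_error(38.89332731, 11):.6f}")
```

If the relation holds for every row, the "mode" figures are best read as estimates derived from the mean and median rather than as the most frequent observed value.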
2. Frequency distribution:
a. Frequency as well as Relative Frequency Distributions:
For X1= Pizza:
X1 Interval   Frequency   Relative frequency
0-30          0           0
30-60         4           0.307692308
60-90         4           0.307692308
90-120        3           0.230769231
120-150       2           0.153846154
Total         13
For X2= Baker/Donuts:
X2 Interval   Frequency   Relative frequency
0-30          0           0
30-60         3           0.272727273
60-90         4           0.363636364
90-120        2           0.181818182
120-150       1           0.090909091
150-180       1           0.090909091
Total         11
For X3= Shoe stores:
X3 Interval   Frequency   Relative frequency
0-30          0           0
30-60         4           0.4
60-90         3           0.3
90-120        2           0.2
120-150       1           0.1
Total         10
For X4= Gift shops:
X4 Interval   Frequency   Relative frequency
0-30          0           0
30-60         3           0.3
60-90         1           0.1
90-120        5           0.5
120-150       1           0.1
Total         10
For X5= Pet stores:
X5 Interval   Frequency   Relative frequency
0-30          6           0.375
30-60         5           0.3125
60-90         4           0.25
90-120        1           0.0625
Total         16
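Each relative frequency in these tables is the class frequency divided by the total count. A minimal sketch of that calculation follows, using the frequencies from the pizza and pet store tables; the function name is illustrative.

```python
def relative_frequencies(counts):
    """Divide each class frequency by the total number of observations."""
    total = sum(counts)
    return [count / total for count in counts]


# Class frequencies copied from the tables (intervals of width 30, starting at 0-30).
PIZZA_COUNTS = [0, 4, 4, 3, 2]
PET_STORE_COUNTS = [6, 5, 4, 1]

if __name__ == "__main__":
    for label, counts in (("pizza", PIZZA_COUNTS), ("pet stores", PET_STORE_COUNTS)):
        rounded = [round(value, 9) for value in relative_frequencies(counts)]
        print(label, rounded)
```

The relative frequencies of each business should sum to 1, which is a quick way to catch a miscounted class.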
b. Relative Frequency Histogram:
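For X1= Pizza:
[Figure: relative frequency histogram. Horizontal axis: intervals 0-30 to 120-150. Vertical axis: relative frequency, 0 to 0.4.]
For X2= Baker/Donuts:
[Figure: relative frequency histogram. Horizontal axis: intervals 0-30 to 150-180. Vertical axis: relative frequency, 0 to 0.4.]
For X3= Shoe stores:
[Figure: relative frequency histogram. Horizontal axis: intervals 0-30 to 120-150. Vertical axis: relative frequency, 0 to 0.6.]
For X4= Gift shops:
[Figure: relative frequency histogram. Horizontal axis: intervals 0-30 to 120-150. Vertical axis: relative frequency, 0 to 0.6.]
For X5= Pet stores:
[Figure: relative frequency histogram. Horizontal axis: intervals 0-30 to 90-120. Vertical axis: relative frequency, 0 to 0.4.]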
3. Results:
The analysis above gives the following results for the businesses in the data set. For the pizza
business, the mean is 83, the mode is 74, the median is 80, the variance is 1165.16, the standard
deviation is 34.13 and the range is 105. The values for the other businesses were calculated in
the same way using software such as SPSS and Excel. For the baker/donuts business, the mean is
92.09, the mode is 76.82, the median is 87, the variance is 1512.69, the standard deviation is
38.89 and the range is 120. For shoe stores, the mean is 72.3, the mode is 65.4, the median is 70,
the variance is 983.78, the standard deviation is 31.36 and the range is 90. For gift shops, the
mean is 87, the mode is 118.5, the median is 97.5, the variance is 1289.11, the standard deviation
is 35.90 and the range is 115. For pet stores, the mean is 51.625, the mode is 43.75, the median
is 49, the variance is 733.05, the standard deviation is 27.07 and the range is 90.
Further, the relative frequency histograms show where the startup costs of each business are
concentrated. For the pizza business, the intervals 30-60 and 60-90 share the highest frequency
of 4 and the highest relative frequency of 0.308. For the baker/donuts business, the interval
60-90 has the highest frequency of 4 and the highest relative frequency of 0.364. For shoe
stores, the interval 30-60 has the highest frequency of 4 and the highest relative frequency of
0.40. For gift shops, the interval 90-120 has the highest frequency of 5 and the highest relative
frequency of 0.50. For pet stores, the interval 0-30 has the highest frequency of 6 and the
highest relative frequency of 0.375.
4. Significance test:
To test the significance of the starting costs of the different businesses, runs tests were
conducted in SPSS in order to analyse the differences using the values of the mean, median and
mode (Wetcher-Hendricks, 2011).
Runs Test
                         X1       X2       X3       X4       X5
Test Value (a)           80.00    87.00    70.00    97.50    49.00
Cases < Test Value       6        5        5        5        8
Cases >= Test Value      7        6        5        5        8
Total Cases              13       11       10       10       16
Number of Runs           2        2        2        2        2
Z                        -2.893   -2.537   -2.348   -2.348   -3.364
Asymp. Sig. (2-tailed)   .004     .011     .019     .019     .001
a. Median
Runs Test 2
                         X1        X2        X3        X4        X5
Test Value (a)           83.0000   92.0909   72.3000   87.0000   51.6250
Cases >= Test Value      6         4         5         6         7
Total Cases              13        11        10        10        16
Z                        -2.893    -2.488    -2.348    -2.318    -3.356
Asymp. Sig. (2-tailed)   .004      .013      .019      .020      .001
a. Mean
Using the mean starting costs of the different businesses, the analysis indicates that the
starting costs of most businesses in the data set fall mainly in the interval of 30-60 million
dollars, which suggests that there is no significant difference in the starting costs between the
types of business.
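The Z values in the median-based runs test table can be reproduced from the counts on each side of the test value and the number of runs. The sketch below uses the normal approximation to the Wald-Wolfowitz runs test with a 0.5 continuity correction; the correction is an assumption about how the software computed Z, chosen because it matches the reported figures, and the function names are illustrative.

```python
import math


def runs_test_z(n_below, n_above, runs):
    """Normal-approximation Z for the runs test, with a 0.5 continuity correction."""
    n = n_below + n_above
    product = 2 * n_below * n_above
    expected_runs = product / n + 1
    variance = product * (product - n) / (n ** 2 * (n - 1))
    difference = runs - expected_runs
    # Continuity correction: move the difference 0.5 toward zero.
    if difference > 0.5:
        difference -= 0.5
    elif difference < -0.5:
        difference += 0.5
    else:
        difference = 0.0
    return difference / math.sqrt(variance)


def two_tailed_p(z):
    """Two-tailed p-value of a standard normal Z statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


# (cases below, cases at or above, number of runs) from the median-based table.
MEDIAN_TABLE = {"X1": (6, 7, 2), "X2": (5, 6, 2), "X3": (5, 5, 2), "X4": (5, 5, 2), "X5": (8, 8, 2)}

if __name__ == "__main__":
    for name, (below, above, runs) in MEDIAN_TABLE.items():
        z = runs_test_z(below, above, runs)
        print(f"{name}: Z = {z:.3f}, p = {two_tailed_p(z):.3f}")
```

A runs test of this kind assesses whether values above and below the test value occur in a random order within one series, so the Z and p-values describe each business separately.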
Task 2:
1. Regression equation:
Correlations
                           X1      X2      X3      X4      X5      X6
Pearson Correlation   X1   1.000   .894    .946    .914    .954    -.912
                      X2   .894    1.000   .844    .749    .838    -.766
                      X3   .946    .844    1.000   .906    .864    -.807
                      X4   .914    .749    .906    1.000   .795    -.841
                      X5   .954    .838    .864    .795    1.000   -.870
                      X6   -.912   -.766   -.807   -.841   -.870   1.000
Sig. (1-tailed)       X1   .       .000    .000    .000    .000    .000
                      X2   .000    .       .000    .000    .000    .000
                      X3   .000    .000    .       .000    .000    .000
                      X4   .000    .000    .000    .       .000    .000
                      X5   .000    .000    .000    .000    .       .000
                      X6   .000    .000    .000    .000    .000    .
N                     X1   27      27      27      27      27      27
                      X2   27      27      27      27      27      27
                      X3   27      27      27      27      27      27
                      X4   27      27      27      27      27      27
                      X5   27      27      27      27      27      27
                      X6   27      27      27      27      27      27
Variables Entered/Removed (b)
Model   Variables Entered         Variables Removed   Method
1       X6, X2, X4, X5, X3 (a)    .                   Enter
a. All requested variables entered.
b. Dependent Variable: X1
The regression analysis is based on the data given for the franchises of All Greens Pvt. Ltd.,
using variables such as annual sales, floor area, advertising expenditure and the number of
families in the different areas. In this regression, X1 is the dependent variable and the other
variables are the independent variables recorded for each franchise store. There is a positive
relationship between X1 and each of X2 to X5, as shown by the high correlation coefficients
(.894 to .954). There is also a significant relationship between the number of families and
sales, taking account of the competitors in the prospective location (Draper and Smith, 2014).
In addition, where more is spent on inventory, floor area and advertising, sales are also
influenced positively. Where a store faces little or no competition, the performance of the
franchise store is very high, but where there are several competitors in the prospective market,
sales decline. Strong performance in a highly competitive market also implies higher operating
costs, because a better advertising strategy is needed to attract customers. Apart from this, X6
is negatively related to every other variable. In particular, there is a strong negative
relationship between X1 and X6 (r = -.912), which is taken into account when estimating the
multiple-variable model for the franchise stores.
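The coefficient table reported under the hypothesis test in this report can be written out as a fitted equation. The sketch below does so under one loud assumption: that "X Variable 1" to "X Variable 5" in that table correspond to X2 to X6 in order, which the report does not state explicitly. The function name and argument names are illustrative, and the units of each variable are not given in the report.

```python
# Coefficients copied from the regression output in this report.
# ASSUMPTION: "X Variable 1..5" in the output map to X2..X6 in order.
INTERCEPT = -18.8594142
SLOPES = {
    "X2": 16.2015736,
    "X3": 0.17463515,
    "X4": 11.526269,
    "X5": 13.5803129,
    "X6": -5.31097141,
}


def predict_x1(x2, x3, x4, x5, x6):
    """Evaluate X1 = intercept + sum of slope * predictor for one store."""
    values = {"X2": x2, "X3": x3, "X4": x4, "X5": x5, "X6": x6}
    return INTERCEPT + sum(SLOPES[name] * values[name] for name in SLOPES)


if __name__ == "__main__":
    # Effect of one extra unit of X6 with every other predictor held fixed.
    baseline = predict_x1(1, 1, 1, 1, 0)
    shifted = predict_x1(1, 1, 1, 1, 1)
    print(f"change in X1 per unit of X6: {shifted - baseline:.4f}")
```

Written this way, each slope is the change in predicted X1 for a one-unit change in that predictor with the others held fixed, and only the last slope is negative.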
2. Applicability of model:
A regression model is used to determine the relationship between two or more variables so that
predictions can be made from the data. This model is a good fit for the given franchise data and
supports a statistical understanding of the business. It also expresses the relationship between
the variables in numerical terms. In the current competitive environment, regression models are
used by the managers of large organisations to optimise business outcomes. They are therefore
considered an indispensable tool for the statistical evaluation of data under the varying
conditions of a business. The data and results above give the R square of the model together with
the adjusted R square. The model is also useful for establishing a linear relationship between
the variables. The output shows that R square is higher than the adjusted R square. In addition,
the approach can also be adapted to use a single predictor instead of multiple variables.
3. Hypothesis test:
Hypothesis testing is an analysis in which a prediction is assessed using sample values such as
the mean. A hypothesis is a reasoned guess that is put to the test in a relevant experiment. The
null hypothesis is that there is no significant relationship between the dependent variable and
any of the independent variables, and stating it is a crucial step before the prediction is
tested (Science, 2017). The alternative hypothesis is also stated in order to determine the
relevance of the chosen statistical problem; it helps to frame the criteria for accepting or
rejecting the null hypothesis. To test this hypothesis, one variable is taken as the dependent
variable and another as the independent variable.
               Coefficients   Standard Error   t Stat        P-value    Lower 95%     Upper 95%     Lower 95.0%   Upper 95.0%
Intercept      -18.8594142    30.15022791      -0.62551481   0.538372   -81.560245    43.841417     -81.560245    43.841417
X Variable 1   16.2015736     3.544437306      4.570986073   0.000166   8.8305127     23.5726344    8.8305127     23.5726344
X Variable 2   0.17463515     0.057606068      3.031540961   0.006347   0.05483678    0.29443353    0.05483678    0.29443353
X Variable 3   11.526269      2.5321033        4.55205324    0.000174   6.26047197    16.7920661    6.26047197    16.7920661
X Variable 4   13.5803129     1.770456609      7.670514392   1.61E-07   9.89844684    17.262179     9.89844684    17.262179
X Variable 5   -5.31097141    1.70542654       -3.11416017   0.005249   -8.857628     -1.76434278   -8.857628     -1.76434278
In order to proceed with the testing, the following values are used (Science, 2017).
Here, the mean value chosen for the X5 variable is 51.625. The standard error of the X Variable 5
coefficient is 1.7054 and its P-value is .005249.
On scrutinising the information above, it can be said that there is a negative relationship
between the dependent and the independent variable. The upper and lower 95% bounds are both
negative, so the interval does not contain zero, and the P-value is below 0.05. The null
hypothesis of no relationship is therefore rejected for this variable under the selected criteria.
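Each t Stat in the coefficient table is the coefficient divided by its standard error, and a coefficient is usually judged significant when its P-value is below the chosen significance level. The sketch below reproduces both steps from the figures in the table; the 0.05 level is an assumption, since the report does not state one, and the function names are illustrative.

```python
# (coefficient, standard error, reported P-value) copied from the coefficient table.
COEFFICIENTS = {
    "Intercept": (-18.8594142, 30.15022791, 0.538372),
    "X Variable 1": (16.2015736, 3.544437306, 0.000166),
    "X Variable 2": (0.17463515, 0.057606068, 0.006347),
    "X Variable 3": (11.526269, 2.5321033, 0.000174),
    "X Variable 4": (13.5803129, 1.770456609, 1.61e-07),
    "X Variable 5": (-5.31097141, 1.70542654, 0.005249),
}

ALPHA = 0.05  # ASSUMPTION: significance level, not stated in the report


def t_statistic(coefficient, standard_error):
    """t = coefficient / standard error, for the null hypothesis that the slope is zero."""
    return coefficient / standard_error


def significant_terms(table, alpha=ALPHA):
    """Names of the terms whose reported P-value is below alpha."""
    return [name for name, (_, _, p_value) in table.items() if p_value < alpha]


if __name__ == "__main__":
    for name, (coef, se, p_value) in COEFFICIENTS.items():
        print(f"{name}: t = {t_statistic(coef, se):.4f}, P = {p_value}")
    print("significant at 0.05:", significant_terms(COEFFICIENTS))
```

At this level every slope has a P-value below the threshold, while the intercept does not.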
4. Slope coefficients:
ANOVA (b)
Model           Sum of Squares   df   Mean Square   F         Sig.
1  Regression   952538.942       5    190507.788    611.590   .000 (a)
   Residual     6541.410         21   311.496
   Total        959080.352       26
a. Predictors: (Constant), X6, X2, X4, X5, X3
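A slope coefficient is the coefficient of an independent variable; it gives the change in the y
value that results from a one-unit change in that x variable while the other variables are held
constant. From the regression analysis, it can be stated that the data establish a relationship
between the variables, and the ANOVA table shows that the model as a whole is significant
(F = 611.590, Sig. = .000).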
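The F value in the ANOVA table is the regression mean square divided by the residual mean square, and R square and adjusted R square follow from the same sums of squares. A minimal sketch of these relations, using the figures from the table (the function name is illustrative):

```python
def anova_summary(ss_regression, ss_residual, df_regression, df_residual):
    """Mean squares, F, R square and adjusted R square from an ANOVA table."""
    ms_regression = ss_regression / df_regression
    ms_residual = ss_residual / df_residual
    ss_total = ss_regression + ss_residual
    df_total = df_regression + df_residual
    r_squared = ss_regression / ss_total
    adjusted_r_squared = 1 - (ss_residual / df_residual) / (ss_total / df_total)
    return {
        "ms_regression": ms_regression,
        "ms_residual": ms_residual,
        "f": ms_regression / ms_residual,
        "r_squared": r_squared,
        "adjusted_r_squared": adjusted_r_squared,
    }


if __name__ == "__main__":
    # Sums of squares and degrees of freedom from the five-predictor model.
    summary = anova_summary(952538.942, 6541.410, 5, 21)
    for key, value in summary.items():
        print(f"{key}: {value:.4f}")
```

Because the adjustment penalises the number of predictors, the adjusted value is always at or below R square.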
5. Confidence interval:
A confidence interval is a range, calculated from a single sample, that is likely to contain the
true value of a parameter. Here, the confidence interval is calculated for one of the variables.
Confidence interval = \bar{X} \pm Z \cdot s / \sqrt{n} (O'Gorman, 2012)
For this, the mean chosen for X1 is 286.574, the standard deviation is 16.26 and the number of
samples is 27.
= 286.574 \pm 1.60 \times 16.26 / \sqrt{27} = 286.574 \pm 1.60 \times 3.13 = 286.574 \pm 5.01,
that is, from 281.57 to 291.58
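Reading the formula as mean plus or minus Z times s / sqrt(n), the interval can be evaluated step by step. The sketch below uses the mean, standard deviation, sample size and Z value quoted in this section; the confidence level that Z = 1.60 is meant to represent is not stated in the report, and the function name is illustrative.

```python
import math


def confidence_interval(mean, std_dev, n, z):
    """Return (lower, upper) for mean +/- z * std_dev / sqrt(n)."""
    margin = z * std_dev / math.sqrt(n)
    return mean - margin, mean + margin


if __name__ == "__main__":
    # Values quoted in the report: mean 286.574, s = 16.26, n = 27, Z = 1.60.
    lower, upper = confidence_interval(286.574, 16.26, 27, 1.60)
    print(f"standard error: {16.26 / math.sqrt(27):.3f}")
    print(f"interval: {lower:.3f} to {upper:.3f}")
```

The division by the square root of n has to be carried out before the multiplication by Z is added to the mean; applying the operations strictly left to right gives a value far outside the data.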
6. Significance test for slope coefficients:
The significance test for the derived slope coefficients is carried out on the given data with
the defined sample size (n = 27).
7. Model re-estimation:
Correlations
                           X1      X3      X6
Pearson Correlation   X1   1.000   .946    -.912
                      X3   .946    1.000   -.807
                      X6   -.912   -.807   1.000
Sig. (1-tailed)       X1   .       .000    .000
                      X3   .000    .       .000
                      X6   .000    .000    .
N                     X1   27      27      27
                      X3   27      27      27
                      X6   27      27      27
ANOVA (b)
Model           Sum of Squares   df   Mean Square   F         Sig.
1  Regression   918438.538       2    459219.269    271.180   .000 (a)
   Residual     40641.814        24   1693.409
   Total        959080.352       26
a. Predictors: (Constant), X6, X3
Coefficient Correlations (a)
Model                  X6      X3
1  Correlations   X6   1.000   .807
                  X3   .807    1.000
   Covariances    X6   7.805   .161
                  X3   .161    .005
a. Dependent Variable: X1
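The two ANOVA tables in this report allow the re-estimated model (X3 and X6) to be compared with the five-predictor model. The sketch below computes R square for each model and a partial F statistic for the three dropped predictors. The report itself does not carry out this comparison, so it is offered only as a standard check on the reported sums of squares; the function names are illustrative.

```python
def r_squared(ss_regression, ss_total):
    """Share of the total sum of squares explained by the regression."""
    return ss_regression / ss_total


def partial_f(ss_res_reduced, ss_res_full, df_res_reduced, df_res_full):
    """Partial F statistic for the predictors dropped from the full model."""
    dropped = df_res_reduced - df_res_full
    extra_ss = ss_res_reduced - ss_res_full
    return (extra_ss / dropped) / (ss_res_full / df_res_full)


if __name__ == "__main__":
    ss_total = 959080.352
    print(f"R square, five predictors: {r_squared(952538.942, ss_total):.4f}")
    print(f"R square, X3 and X6 only:  {r_squared(918438.538, ss_total):.4f}")
    # Residual sums of squares and residual df from the two ANOVA tables.
    print(f"partial F (3 dropped predictors): {partial_f(40641.814, 6541.410, 24, 21):.2f}")
```

A large partial F would indicate that the dropped predictors explained a meaningful share of the variation in X1, so the smaller model trades some fit for simplicity.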
8. Annual sales:
By rearranging the model and applying the regression analysis, sales can be predicted for a
franchise with 1,000 sq. ft. of floor area, $150,000 of inventory, expenses of $5,000 and 2
competitors in the prospective market.
On the basis of the information in the sheet, sales are predicted to be $436,000 from the
rearranged model, using the values of the X variables and the constant.
References:
Holcomb, Z. (2016). Fundamentals of Descriptive Statistics. UK: Routledge.
Wetcher-Hendricks, D. (2011). Analyzing Quantitative Data: An Introduction for Social
Researchers. USA: John Wiley and Sons.
Henkel, R. (2017). The Significance Test Controversy: A Reader. UK: Routledge.
O'Gorman, T. (2012). Adaptive Tests of Significance Using Permutations of Residuals with R
and SAS. USA: John Wiley and Sons.
Draper, N. and Smith, H. (2014). Applied Regression Analysis. USA: John Wiley and Sons.
Science (2017). Hypothesis Testing. Retrieved from:
https://onlinecourses.science.psu.edu/stat200/node/54
