GROUP DETAILS
Statistics
Table of Contents
Task 1
Task 2
References
Task 1:
1. Descriptive statistics:
Descriptive statistics are used to summarise the important variables in a data set (Holcomb,
2016). They can be calculated with software such as MS Excel or SPSS, which gives accurate
results with little effort. Dependent and independent variables can both be summarised in this
way, which provides a basis for examining the relationships between the variables in the data
set. The measures covered include the mean, median, mode, range, standard deviation and
variance. The results obtained with the software are as follows:
Bakers/donuts:
Mean 92.09090909
Median 87
Mode 76.82
Kurtosis -0.436922711
Skewness 0.509844144
Range 120
Minimum 40
Maximum 160
Sum 1013
Count 11
Shoe stores:
Mean 72.3
Median 70
Mode 65.4
Kurtosis -0.958969069
Skewness 0.546077569
Range 90
Minimum 35
Maximum 125
Sum 723
Count 10
Gift shop:
Mean 87
Median 97.5
Mode 118.5
Kurtosis -0.485709919
Skewness 0.077293703
Range 115
Minimum 35
Maximum 150
Sum 870
Count 10
Pet stores:
Mean 51.625
Median 49
Mode 43.75
Kurtosis -0.47673397
Skewness 0.633105979
Range 90
Minimum 20
Maximum 110
Sum 826
Count 16
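The summary figures above can be cross-checked with a few lines of Python. The sketch below recomputes each mean from Sum and Count and each range from Minimum and Maximum. It also shows that the non-integer Mode values are consistent with the empirical relationship mode = 3 x median - 2 x mean. Whether the original spreadsheet actually used that relationship is an assumption, and the function names are illustrative. The business names are matched to the tables through the means quoted in the Results section.

```python
# Cross-check of the reported summary statistics (values copied from the tables above).
# Assumption: the Mode rows were derived with the empirical relationship
# mode = 3 * median - 2 * mean; the source does not state this explicitly.


def empirical_mode(mean: float, median: float) -> float:
    """Empirical (Pearson) approximation of the mode."""
    return 3 * median - 2 * mean


# business: (sum, count, median, minimum, maximum)
SUMMARY = {
    "bakers/donuts": (1013, 11, 87, 40, 160),
    "shoe stores": (723, 10, 70, 35, 125),
    "gift shop": (870, 10, 97.5, 35, 150),
    "pet stores": (826, 16, 49, 20, 110),
}


def summary_check(name: str) -> dict:
    """Recompute mean, range and empirical mode for one business."""
    total, count, median, minimum, maximum = SUMMARY[name]
    mean = total / count
    return {
        "mean": mean,
        "range": maximum - minimum,
        "mode": empirical_mode(mean, median),
    }


if __name__ == "__main__":
    for business in SUMMARY:
        print(business, summary_check(business))
```

The recomputed values can be compared line by line with the Mean, Range and Mode rows of each table.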
2. Frequency distribution:
Pizza:
Interval Frequency Relative frequency
0-30 0 0
30-60 4 0.307692308
60-90 4 0.307692308
90-120 3 0.230769231
120-150 2 0.153846154
Total 13
Bakers/donuts:
Interval Frequency Relative frequency
0-30 0 0
30-60 3 0.272727273
60-90 4 0.363636364
90-120 2 0.181818182
120-150 1 0.090909091
150-180 1 0.090909091
Total 11
Shoe stores:
Interval Frequency Relative frequency
0-30 0 0
30-60 4 0.4
60-90 3 0.3
90-120 2 0.2
120-150 1 0.1
Total 10
Gift shop:
Interval Frequency Relative frequency
0-30 0 0
30-60 3 0.3
60-90 1 0.1
90-120 5 0.5
120-150 1 0.1
Total 10
Pet stores:
Interval Frequency Relative frequency
0-30 6 0.375
30-60 5 0.3125
60-90 4 0.25
90-120 1 0.0625
Total 16
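The relative frequencies in these tables are each class frequency divided by the total count. A minimal sketch of that calculation follows, using the counts of the first table above (Total 13). The binning helper is an illustration only: the raw observations are not listed in the report, so the half-open interval convention [lower, upper) is an assumption.

```python
def relative_frequencies(counts):
    """Divide each class frequency by the total number of observations."""
    total = sum(counts)
    if total == 0:
        raise ValueError("no observations")
    return [count / total for count in counts]


def bin_counts(values, width=30, n_bins=5):
    """Count values into intervals [0, width), [width, 2 * width), ...

    Assumption: intervals are closed on the left and open on the right.
    Values outside the covered range are ignored.
    """
    counts = [0] * n_bins
    for value in values:
        index = int(value // width)
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts


# Frequencies for the intervals 0-30, 30-60, 60-90, 90-120, 120-150 (Total 13).
first_table_counts = [0, 4, 4, 3, 2]

if __name__ == "__main__":
    for share in relative_frequencies(first_table_counts):
        print(round(share, 9))
```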
3. Results:
The analysis above gives the following results for each business in the data set, calculated
with software such as SPSS and Excel. For pizza, the mean is 83, the mode is 74, the median is
80, the variance is 1165.16, the standard deviation is 34.13 and the range is 105. For bakers or
donuts, the mean is 92.09, the mode is 76.82, the median is 87, the variance is 1512.69, the
standard deviation is 38.89 and the range is 120. For shoe stores, the mean is 72.3, the mode is
65.4, the median is 70, the variance is 983.78, the standard deviation is 31.36 and the range is
90. For gift shop, the mean is 87, the mode is 118.5, the median is 97.5, the variance is
1289.11, the standard deviation is 35.90 and the range is 115. For pet stores, the mean is
51.625, the mode is 43.75, the median is 49, the variance is 733.05, the standard deviation is
27.07 and the range is 90.
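Each standard deviation quoted above is the square root of the corresponding variance. The short sketch below repeats that arithmetic for the five businesses; the dictionary and function names are illustrative, and small differences in the last digit come from rounding in the reported figures.

```python
import math

# business: (variance, standard deviation) as quoted in the text above
REPORTED = {
    "pizza": (1165.16, 34.13),
    "bakers/donuts": (1512.69, 38.89),
    "shoe stores": (983.78, 31.36),
    "gift shop": (1289.11, 35.90),
    "pet stores": (733.05, 27.07),
}


def sd_from_variance(variance: float) -> float:
    """Standard deviation is the square root of the variance."""
    return math.sqrt(variance)


def discrepancies() -> dict:
    """Absolute gap between the recomputed and the reported standard deviation."""
    return {
        business: abs(sd_from_variance(variance) - reported_sd)
        for business, (variance, reported_sd) in REPORTED.items()
    }


if __name__ == "__main__":
    for business, gap in discrepancies().items():
        print(f"{business}: gap {gap:.4f}")
```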
Further, the relative frequency histograms for the different types of business show that, for
pizza, the intervals 30-60 and 60-90 share the highest frequency of 4 and the highest relative
frequency of about 0.308. For bakers or donuts, the interval 60-90 has the highest frequency, 4,
and the highest relative frequency, about 0.364. For shoe stores, the interval 30-60 has the
highest frequency, 4, and the highest relative frequency, 0.40. For gift shop, the interval
90-120 has the highest frequency, 5, and the highest relative frequency, 0.50. For pet stores,
the interval 0-30 has the highest frequency, 6, and the highest relative frequency, 0.375.
4. Significance test:
To test whether the starting costs of the different businesses differ significantly, a runs test
was conducted in SPSS, using the median and the mean as cut points.
Runs Test
X1 X2 X3 X4 X5
Total Cases 13 11 10 10 16
Number of Runs 2 2 2 2 2
a. Median
Runs Test 2
X1 X2 X3 X4 X5
Total Cases 13 11 10 10 16
Number of Runs 2 2 2 2 2
a. Mean
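For reference, a runs test about a cut point can be sketched as follows. This is a plain Wald-Wolfowitz test with a normal approximation and no continuity correction, so its z value may differ from the SPSS output. The grouping rule (values at or above the cut point against values below it) is an assumption, and the sample data in the usage line is hypothetical.

```python
import math
from statistics import NormalDist


def runs_test(values, cut_point):
    """Return (number of runs, z statistic, two-sided p-value)."""
    # Assumption: values >= cut_point form one group, values below it the other.
    above = [value >= cut_point for value in values]
    n1 = sum(above)
    n2 = len(above) - n1
    if n1 == 0 or n2 == 0:
        raise ValueError("all values fall on one side of the cut point")

    # A new run starts whenever consecutive observations change group.
    runs = 1 + sum(1 for a, b in zip(above, above[1:]) if a != b)

    n = n1 + n2
    expected = 2 * n1 * n2 / n + 1
    variance = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    if variance == 0:
        raise ValueError("sample too small for the normal approximation")

    z = (runs - expected) / math.sqrt(variance)
    p_value = 2 * NormalDist().cdf(-abs(z))
    return runs, z, p_value


if __name__ == "__main__":
    # Hypothetical, already sorted values with a hypothetical cut point.
    print(runs_test([1, 2, 3, 10, 11, 12], cut_point=6.5))
```

A count of 2 runs is the minimum possible when values fall on both sides of the cut point.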
Using the mean starting cost of each business, the analysis shows that the starting costs of
most businesses in the data set fall mainly in the interval of 30-60 million dollars, which
indicates that there is no significant difference in the starting costs of the different
businesses.
Task 2:
1. Regression equation:
Correlations
X1 X2 X3 X4 X5 X6
N X1 27 27 27 27 27 27
X2 27 27 27 27 27 27
X3 27 27 27 27 27 27
X4 27 27 27 27 27 27
X5 27 27 27 27 27 27
X6 27 27 27 27 27 27
Variables Entered/Removed (b)
b. Dependent Variable: X1
The regression equation is derived from the data for the franchises of All Greens Pvt Ltd. on
the basis of variables such as annual sales, floor area, advertising expenditure and the number
of families in different areas. From the regression output above, X1 is the dependent variable
and the remaining variables are independent for each franchise store. There is a positive
relationship between X1 and several of the independent variables, as shown by the high
correlation coefficients. There is also a significant relationship between sales and the number
of families, taking into account the competitors in the prospective location (Draper and Smith,
2014). Likewise, higher spending on inventory, floor area and advertising is associated with
higher sales. With little or no competition, the performance of a franchise store is very high,
but when there are several competitors in the prospective market, sales decline. Strong
franchise performance also brings higher costs, because operating in a highly competitive market
requires a better advertising strategy to attract customers. Apart from this, X6 has a negative
relationship with the dependent variable: a low, negative relationship exists between X1 and X6
when estimating the quantitative measure for the multiple regression.
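The sign of each relationship described above is the sign of the correlation coefficient between the two variables. The sketch below computes Pearson's r in plain Python. The three data series are hypothetical and only illustrate a positive and a negative relationship; they are not the franchise data, and the variable names are illustrative.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant series has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical figures, for illustration only.
sales = [230, 260, 300, 340, 400]
families = [8, 9, 11, 13, 16]
competitors = [12, 10, 9, 6, 3]

if __name__ == "__main__":
    print("sales vs families:", round(pearson_r(sales, families), 3))
    print("sales vs competitors:", round(pearson_r(sales, competitors), 3))
```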
2. Applicability of model:
A regression model is used to determine the relationship between two or more variables so that
predictions can be made from the data. The model fits the franchise data well and supports a
statistical understanding of the business. It expresses the relationship between the variables
in quantitative, numerical terms. In today's competitive environment, managers of large
organisations use regression models to improve business outcomes, and the technique is
considered an indispensable tool for the statistical evaluation of data under varying business
conditions. The results above report R square for the model together with its adjusted value.
The model is also useful for establishing a linear relationship between two variables. The
output above shows that R square is higher than adjusted R square. In addition to this, it also
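The comparison between R square and adjusted R square follows from the usual adjustment formula, 1 - (1 - R^2)(n - 1)/(n - k - 1). A minimal sketch is given below. The sample size n = 27 is the N shown in the correlations table, k = 5 assumes that all of X2 to X6 were entered as predictors, and the R square value of 0.95 is purely hypothetical because the fitted value is not reproduced here.

```python
def adjusted_r_squared(r_squared: float, n: int, k: int) -> float:
    """Adjust R square for the number of predictors k and sample size n."""
    if n - k - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)


if __name__ == "__main__":
    # Hypothetical R square; n and k as described in the lead-in.
    r2 = 0.95
    print(adjusted_r_squared(r2, n=27, k=5))
```

For any model with at least one predictor and R square below 1, the adjusted value is lower than R square itself.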
3. Hypothesis test:
Hypothesis testing is an analysis in which a prediction is assessed using mean values; the
hypothesis is a conjecture that is tested against the data. The null hypothesis is that there is
no significant relationship between the dependent variable and any of the independent variables,
which is the starting point for testing the prediction (Science, 2017). The alternative
hypothesis is also stated, so that criteria can be framed for accepting or rejecting the null
hypothesis. In order to test this hypothesis, one
(Science, 2017)
Here, the mean value is taken from variable X5, which is 51.625. With this, the standard deviation
On scrutinising the information above, it can be said that a low, negative relationship exists
between the dependent and independent variables. The upper and lower values are also negative,
so the null hypothesis is accepted under the selected criteria.
4. Slope coefficients:
ANOVA (b)
Model Sum of Squares df
Total 959080.352 26
X3
b. Dependent Variable: X1
A slope coefficient is the coefficient of an independent variable; it gives the change in the y
value for a one-unit change in the corresponding x variable. Together with the regression
equation, the data also establish the correlation coefficient.
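For a single predictor, the slope described above is the covariance of x and y divided by the variance of x. The sketch below computes the least-squares slope and intercept in plain Python. It covers the one-variable case only, not the multiple regression fitted in SPSS, and the data in the usage line is hypothetical.

```python
def least_squares_line(x, y):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    # Hypothetical data lying exactly on y = 1 + 2x.
    print(least_squares_line([1, 2, 3, 4], [3, 5, 7, 9]))
```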
5. Confidence interval:
A confidence interval is a range, calculated from a single sample, that is expected to contain
the true value of a parameter with a stated probability. Here the confidence interval is
calculated for the chosen variable.
For this, the mean of X1 is taken as 286.574 and the standard deviation as 16.26.
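A confidence interval for a mean can be sketched from these two figures as mean plus or minus z x sd / sqrt(n). Several assumptions are made here and are not stated in the report: a 95% level, a sample size of n = 27 (the N shown in the correlations tables), 16.26 treated as the sample standard deviation, and a normal approximation rather than the t distribution.

```python
import math
from statistics import NormalDist


def mean_confidence_interval(mean, sd, n, level=0.95):
    """Normal-approximation confidence interval for a mean."""
    if n < 1 or not 0 < level < 1:
        raise ValueError("need n >= 1 and 0 < level < 1")
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% level
    margin = z * sd / math.sqrt(n)
    return mean - margin, mean + margin


if __name__ == "__main__":
    # Mean and standard deviation from the text; n and level are assumptions.
    low, high = mean_confidence_interval(286.574, 16.26, n=27)
    print(round(low, 3), round(high, 3))
```

With a sample this small, a t-based interval would be slightly wider than the one produced by this sketch.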
The significance test for the derived variables is carried out on the defined sample size for
the given data.
7. Model re-estimation:
Correlations
X1 X3 X6
X3 .000 . .000
X6 .000 .000 .
N X1 27 27 27
X3 27 27 27
X6 27 27 27
ANOVA (b)
Model Sum of Squares df
Total 959080.352 26
b. Dependent Variable: X1
Coefficient Correlationsa
Model X6 X3
X3 .807 1.000
X3 .161 .005
a. Dependent Variable: X1
8. Annual sales:
On the basis of the rearranged regression model, sales can be predicted for a franchisee
with 1000 sq. ft., $150000 inventory, expenses for $5000
On the basis of the information in the sheet, sales are predicted as $436000 from the
rearranged values of the X variables and the constant.
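A prediction of this kind is obtained by substituting the input values into the fitted equation, y = b0 + b1 x1 + ... + bk xk. The sketch below shows only the mechanics. The intercept and coefficients are hypothetical placeholders, not the fitted values from the SPSS output, so the result is not meant to reproduce the $436000 figure. The choice of X3 and X6 as predictors follows the re-estimated model above, and the input values are illustrative.

```python
def predict(intercept, coefficients, inputs):
    """Evaluate intercept + sum(coefficient * input) over the named predictors."""
    missing = set(coefficients) - set(inputs)
    if missing:
        raise KeyError(f"missing inputs: {sorted(missing)}")
    return intercept + sum(b * inputs[name] for name, b in coefficients.items())


if __name__ == "__main__":
    # Hypothetical intercept and slopes, NOT the estimates from the report.
    hypothetical_intercept = 10.0
    hypothetical_slopes = {"X3": 0.5, "X6": -4.0}
    # Illustrative inputs only.
    print(predict(hypothetical_intercept, hypothetical_slopes, {"X3": 150, "X6": 5}))
```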
References:
Draper, N. and Smith, H. (2014). Applied Regression Analysis. USA: John Wiley and Sons.
https://onlinecourses.science.psu.edu/stat200/node/54