
STATISTICAL INFERENCE TERM REPORT

by
Mehlub Usmani
(ID #21675)

Supervisor
Mr. Shahbaz Khan

College of Business Management


Institute of Business Management (IoBM)

Karachi

August 2019

Introduction
The purpose of this report is to learn how to use SPSS by applying statistical
methods. The SPSS statistical package is one of the most popular statistical
packages; it can perform highly complex data manipulation and analysis with
simple instructions. It is frequently used in the social sciences. SPSS has four
windows: the Data Editor, the Output Viewer, the Syntax Editor, and the Script
window. Many tasks can be performed with the menus and dialog boxes, but some
powerful features are available only through command syntax. The Graphs command
is used in SPSS to create charts. SPSS produces the graphs commonly used in the
social sciences, such as histograms, scatter plots, and regression lines. The
Graphs command allows changing parts of the axes, including text, changing color
and font, copying, pasting, and exporting. Graphs can also be edited manually
for publication.
A few topics in this report are worked through in Minitab instead. Minitab
Statistical Software has comprehensive yet easy-to-use tools to help companies
run smarter and more thorough data analysis. Even without advanced expertise in
statistics, those who use the software can draw substantial conclusions from
data easily. Its ease of use helps not just businesses, but also colleges and
universities when it comes to teaching statistics and data analysis.
The flagship software’s supplementary application, Minitab Quality Trainer, aids
users in further analyzing derived statistics. It serves as a cost-effective way to
study statistics anytime users are online. Aside from this, Companion by Minitab
provides tools so that users can present data with confidence.
Aside from the main functions, Minitab is recognized in the statistics industry for
providing superior customer service. User reviews attest to its exceptional training
and reliable customer support. Companies are also able to maximize usage because
of the software’s multi-user licensing.

Sampling Distribution
Chap# 6

Question 1: Sampling [With Replacement]

A trainer tested his athletes during a 2-mile-long run and the results of 4 athletes were 20,
15, 18, and 22 minutes for completing the laps. Let’s assume that the four athletes
constitute the population.

Compute;

a) Obtain The Sample Distribution Of The Mean With Sample Size 2 With Replacement
b) Sample And Population Mean Of The Data
c) Compute Sample And Population Variances And Standard Deviation
d) Verify:
(i) E(X̄) = μ
(ii) Var(X̄) = σ²/n

Solution:

Components   A        B         C        D         Σ
X            20       15        18       22        75
X-μ          1.25     -3.75     -0.75    3.25
(X-μ)²       1.5625   14.0625   0.5625   10.5625   26.75

a) Obtain The Sample Distribution Of The Mean With Sample Size 2 With
Replacement

Notation  Sample  X1  X2  X̄i    X̄i-μ   (X̄i-μ)²
A,A       20,20   20  20  20.0  1.25   1.5625
A,B       20,15   20  15  17.5  -1.25  1.5625
A,C       20,18   20  18  19.0  0.25   0.0625
A,D       20,22   20  22  21.0  2.25   5.0625
B,A       15,20   15  20  17.5  -1.25  1.5625
B,B       15,15   15  15  15.0  -3.75  14.0625
B,C       15,18   15  18  16.5  -2.25  5.0625
B,D       15,22   15  22  18.5  -0.25  0.0625
C,A       18,20   18  20  19.0  0.25   0.0625
C,B       18,15   18  15  16.5  -2.25  5.0625
C,C       18,18   18  18  18.0  -0.75  0.5625
C,D       18,22   18  22  20.0  1.25   1.5625
D,A       22,20   22  20  21.0  2.25   5.0625
D,B       22,15   22  15  18.5  -0.25  0.0625
D,C       22,18   22  18  20.0  1.25   1.5625
D,D       22,22   22  22  22.0  3.25   10.5625
Total                     300          53.5

b) Sample And Population Mean Of The Data:

μX̄ = ΣX̄ / (number of samples) = 300 / 16 = 18.75
μ = ΣX / N = 75 / 4 = 18.75

c) Compute Sample And Population Variances And Standard Deviation

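Variance of the sample means: (σX̄)² = Σ(X̄i-μ)² / 16 = 53.5 / 16 = 3.34375

Standard deviation of the sample means: σX̄ = 1.8286

Population variance: σ² = Σ(X-μ)² / N = 26.75 / 4 = 6.6875

Population standard deviation: σ = 2.586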

d) Verify:
(i) E(X̄) = μ

μX̄ = 18.75
μ = 18.75

μX̄ = μ

(ii) Var(X̄) = σ²/n

σ = 2.586
σX̄ = 1.828
√n = √2 = 1.4142

Verification: σX̄ = σ/√n = 2.586 / 1.4142 = 1.8286

Conclusion:

Hence, verified that Var(X̄) = σ²/n and μX̄ = μ.
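
The check above can also be done by brute force. The following is a minimal sketch, assuming only Python's standard library; the variable names are illustrative and not part of the original SPSS/Minitab work. It lists all 16 ordered samples of size 2 drawn with replacement and compares the mean and variance of the sample means with μ and σ²/n.

```python
from itertools import product
from statistics import mean, pvariance

population = [20, 15, 18, 22]   # minutes for athletes A, B, C, D
n = 2                           # sample size

# Every ordered sample of size n drawn with replacement (4**2 = 16 samples).
sample_means = [mean(s) for s in product(population, repeat=n)]

mu = mean(population)                    # population mean
sigma2 = pvariance(population)           # population variance (divides by N)
mean_of_means = mean(sample_means)       # mean of the sampling distribution
var_of_means = pvariance(sample_means)   # divides by the 16 samples

print("mu =", mu, " mean of sample means =", mean_of_means)
print("sigma^2 / n =", sigma2 / n, " variance of sample means =", var_of_means)
```

Dividing by the number of samples (16), not by 16 - 1, is what makes the variance of the sample means agree with σ²/n.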

Question 2: Sampling [Without Replacement]

Draw all possible samples of size 2 without replacement from a population consisting of 3, 6,
9, 12, 15 & 18. Form the sampling distribution of sample means and verify the results.

Solution:

Components   A      B      C      D      E      F      Σ
X            3.00   6.00   9.00   12.00  15.00  18.00  63.00
X-μ          -7.5   -4.5   -1.5   1.5    4.5    7.5
(X-μ)²       56.25  20.25  2.25   2.25   20.25  56.25  157.50

Notation  Sample  X1     X2     X̄i     X̄i-μ    (X̄i-μ)²
A,B       3,6     3.00   6.00   4.50   -6.000  36
A,C       3,9     3.00   9.00   6.00   -4.500  20.25
A,D       3,12    3.00   12.00  7.50   -3.000  9
A,E       3,15    3.00   15.00  9.00   -1.500  2.25
A,F       3,18    3.00   18.00  10.50  0.000   0
B,C       6,9     6.00   9.00   7.50   -3.000  9
B,D       6,12    6.00   12.00  9.00   -1.500  2.25
B,E       6,15    6.00   15.00  10.50  0.000   0
B,F       6,18    6.00   18.00  12.00  1.500   2.25
C,D       9,12    9.00   12.00  10.50  0.000   0
C,E       9,15    9.00   15.00  12.00  1.500   2.25
C,F       9,18    9.00   18.00  13.50  3.000   9
D,E       12,15   12.00  15.00  13.50  3.000   9
D,F       12,18   12.00  18.00  15.00  4.500   20.25
E,F       15,18   15.00  18.00  16.50  6.000   36
Total                           157.5          157.5

Sample And Population Mean Of The Data:

μX̄ = ΣX̄ / (number of samples) = 157.5 / 15 = 10.5
μ = ΣX / N = 63 / 6 = 10.5

Compute Sample And Population Variances And Standard Deviation

Population variance: σ² = Σ(X-μ)² / N = 157.50 / 6 = 26.25
Population SD: σ = 5.1235

Variance of the sample means: (σX̄)² = Σ(X̄i-μ)² / 15 = 157.5 / 15 = 10.5
SD of the sample means: σX̄ = 3.24

√n = √2 = 1.4142
√(N-n) = √4 = 2
√(N-1) = √5 = 2.2361

Verification: σX̄ = (σ/√n) × √(N-n)/√(N-1) = (5.1235 / 1.4142) × (2 / 2.2361) = 3.24

Conclusion:

Hence, verified that Var(X̄) = (σ²/n) × (N-n)/(N-1) and μX̄ = μ.
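
The same check can be sketched for sampling without replacement. This is an illustrative example that assumes only Python's standard library; the names are hypothetical. It lists the 15 unordered samples of size 2 and compares the variance of their means with (σ²/n) × (N-n)/(N-1).

```python
from itertools import combinations
from statistics import mean, pvariance

population = [3, 6, 9, 12, 15, 18]
N, n = len(population), 2

# Every unordered sample of size n drawn without replacement (15 samples).
sample_means = [mean(s) for s in combinations(population, n)]

mu = mean(population)
sigma2 = pvariance(population)            # population variance (divides by N)
mean_of_means = mean(sample_means)
var_of_means = pvariance(sample_means)    # divides by the 15 samples
fpc = (N - n) / (N - 1)                   # finite population correction

print("mu =", mu, " mean of sample means =", mean_of_means)
print("(sigma^2 / n) * fpc =", sigma2 / n * fpc)
print("variance of sample means =", var_of_means)
```

The finite population correction is needed here because each draw changes what is left in the population; it drops out when sampling with replacement.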

Confidence Intervals for 1 Sample
Chap# 7

Question 3: Confidence Interval for One Sample Mean

A study by Engro Polymers recorded the distraction suffered at the work station by 32
employees as 2, 5, 4, 3, 1, 2, 2, 5, 4, 3, 2, 5, 4, 3, 1, 2, 2, 5, 4, 3, 2, 5, 4, 3, 1, 2, 2, 5,
4, 3, 1 and 2 hours per day. The stated population mean is 4.90 hours. Distraction takes
the form of emails, calls, visits, activities, etc. Find the 91% confidence interval for the
true mean.

Assumptions:
a) Dependent variable was measured at the interval or ratio level (i.e., continuous).
b) There were no significant outliers.
c) The variable was approximately normally distributed
d) The sample is a random sample

Solution:

Conclusion:

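The mean distraction time (3.00 hours, standard error 0.23760) was lower than the stated
population mean of 4.9 hours, a difference of 1.90 hours. The interval 2.5154 to 3.4846
lies entirely below 4.9, so the difference is statistically significant. These limits
correspond to the 95% critical value (t = 2.0395 with 31 degrees of freedom); a 91%
interval uses a smaller critical value and is therefore narrower.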
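
As a cross-check of the interval, here is a minimal sketch in plain Python, not the SPSS procedure used for the report. The critical value is an assumption taken from a t table: 2.0395 is the two-sided 95% value for 31 degrees of freedom, and with it the limits 2.5154 and 3.4846 are reproduced. For a 91% interval, pass the corresponding table value instead.

```python
from math import sqrt
from statistics import mean, stdev

hours = [2, 5, 4, 3, 1, 2, 2, 5, 4, 3, 2, 5, 4, 3, 1, 2,
         2, 5, 4, 3, 2, 5, 4, 3, 1, 2, 2, 5, 4, 3, 1, 2]

def t_interval(data, t_crit):
    """Return (lower, upper) = mean -/+ t_crit * s / sqrt(n)."""
    n = len(data)
    std_error = stdev(data) / sqrt(n)   # sample SD uses n - 1
    centre = mean(data)
    return centre - t_crit * std_error, centre + t_crit * std_error

# Assumed table value: t(0.975, df = 31) = 2.0395
lower, upper = t_interval(hours, 2.0395)
print(round(lower, 4), round(upper, 4))
```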

Question 4: Confidence Interval for One Sample Proportion

SMM GROUP surveyed 1000 students at IoBM and found that 532 students paid for their
education by merit scholarship. Find the 95% confidence interval for the true proportion
of students who paid for their education by scholarship.

Assumptions:

• Each response falls into one of two categories (paid by scholarship or not).
• The sample is large enough for the normal approximation (np̂ and n(1 - p̂) are both at
least 5).
• The sample is a simple random sample from the population. Each individual in the
population has an equal probability of being selected in the sample.

Solution:

Conclusion:

The 95% confidence interval (CI) for the proportion (p) is 0.501 to 0.5629, which
equates to 50.1% to 56.29%.

Therefore, we can be 95% confident that the true proportion of students who paid for
their education by scholarship is between 50.1% and 56.29%.
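
The interval can be reproduced with the usual normal-approximation formula p̂ ± z × √(p̂(1 - p̂)/n). The sketch below assumes only Python's standard library; the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes, n, confidence=0.95):
    """Normal-approximation (Wald) interval for one proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 1.96 for 95%
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

lower, upper = proportion_interval(532, 1000)
print(round(lower, 4), round(upper, 4))
```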

Question 5: Confidence Interval for One Sample Variance

The number of calories per 100-gram packet of pita chips selected from different brands
is listed below. Estimate the true population variance and standard deviation of the
number of calories per packet with 95% confidence.

536 256 780
234 756 378
110 355 289
510 267 367
330 436 440
547 378 313

Assumptions:

• The data are continuous (not discrete).
• The data follow the normal probability distribution.
• The sample is a simple random sample from the population. Each individual in the
population has an equal probability of being selected in the sample.

Solution:

Conclusion:

The 95% confidence interval shows that a likely range for the population standard deviation
is between 129 and 258 calories.
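
A minimal sketch of the chi-square interval for a variance, (n-1)s²/χ²(upper) to (n-1)s²/χ²(lower), is given below. It uses only Python's standard library, so the two critical values are assumptions taken from a chi-square table for 17 degrees of freedom (30.191 and 7.564); the function name is illustrative.

```python
from math import sqrt
from statistics import variance

calories = [536, 256, 780, 234, 756, 378, 110, 355, 289,
            510, 267, 367, 330, 436, 440, 547, 378, 313]

def variance_interval(data, chi2_upper, chi2_lower):
    """Interval for the population variance from chi-square table values."""
    df = len(data) - 1
    s2 = variance(data)                       # sample variance (n - 1)
    return df * s2 / chi2_upper, df * s2 / chi2_lower

# Assumed table values for df = 17: chi2(0.975) = 30.191, chi2(0.025) = 7.564
var_low, var_high = variance_interval(calories, 30.191, 7.564)
sd_low, sd_high = sqrt(var_low), sqrt(var_high)
print(round(var_low), round(var_high))   # limits for the variance
print(round(sd_low), round(sd_high))     # limits for the standard deviation
```

Taking square roots of the variance limits gives the limits for the standard deviation quoted in the conclusion.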

Hypothesis Testing
Chap# 8

Question 6: Hypothesis Testing for One Sample Mean

Escobar performed a study to validate a translated version of the Western Ontario and
McMaster University index (WOMAC) questionnaire used with Spanish-speaking patients with
hip or knee osteoarthritis. For the 5 women classified with severe hip pain, the WOMAC
mean function score was 16.8 with a standard deviation of 6.3 (the individual scores
being 18, 20, 25, 11 and 10). We wish to know whether we may conclude that the mean
function score for a population of similar women with severe hip pain differs from the
claimed value of 30.

Assumptions:

1) Alpha is 0.05
2) Sample size is 5
3) Confidence level is 95%
4) The data are continuous (not discrete).
5) The data follow the normal probability distribution.
6) The sample is a simple random sample from the population. Each individual in the
population has an equal probability of being selected in the sample.

Solution:

H0: μ = 30
H1: μ  ≠ 30

Conclusion:

Since the p-value 0.004 (significance value) is less than alpha (0.05), we reject the null
hypothesis and accept the alternative hypothesis.
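
The test statistic behind this decision is t = (x̄ - μ0) / (s/√n). The following sketch, assuming only Python's standard library, computes it for the five scores and compares it with a critical value; 2.776 is an assumed table value for a two-sided test at alpha 0.05 with 4 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

scores = [18, 20, 25, 11, 10]
mu0 = 30                      # hypothesized mean under H0

def one_sample_t(data, hypothesized_mean):
    """t statistic for H0: mu = hypothesized_mean."""
    n = len(data)
    return (mean(data) - hypothesized_mean) / (stdev(data) / sqrt(n))

t_stat = one_sample_t(scores, mu0)
t_crit = 2.776                # assumed t-table value, df = 4, two-sided 0.05
reject_h0 = abs(t_stat) > t_crit
print(round(t_stat, 3), reject_h0)
```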

Question 7: Hypothesis Testing for One Sample Proportion

For Pakistanis using library services, the Defense Library Association (DLA) claims that 75%
borrow books. A library director feels that this is not true, so he randomly selects 50
borrowers and finds that 60 borrowed books. Can he show that the DLA claim is incorrect?
Use alpha 0.05.

Assumptions:

• Each response falls into one of two categories (borrows books or not).
• The sample is large enough for the normal approximation (np and n(1 - p) are both at
least 5).
• The sample is a simple random sample from the population. Each individual in the
population has an equal probability of being selected in the sample.

Solution:

H0: p = 0.75


H1: p  ≠ 0.75

Conclusion:

Since the p-value (significance value) 0.136 is greater than the significance level of 0.05,
the null hypothesis is not rejected. At the 0.05 level, the evidence is not strong enough to
say that the population proportion differs from 75%.

Question 8: Hypothesis Testing for One Sample Variance

A sociologist wishes to see if it is true that for a certain group of professional women, the
average age at which they have their first child is 28.6 years. A random sample of 36 women
is selected, and their ages at the birth of their first child are recorded. At alpha 0.05, does
the evidence refute the sociologist’s assertion?

32 39 31
46 41 30
36 35 28
45 31 29
55 26 27

Assumptions:

• The data are continuous (not discrete).
• The data follow the normal probability distribution.
• The sample is a simple random sample from the population. Each individual in the
population has an equal probability of being selected in the sample.

Solution:

H0: σ² = 28.6
H1: σ² ≠ 28.6

Conclusion:

Since the p-value 0.001 (significant value) is less than alpha (0.05), we will reject the null
hypothesis and accept the alternate hypothesis.

Testing the Difference between Two Means, Proportions, and Variances
Chap# 9

Question 9: Testing the Hypothesis for Difference between Two Means Using T Test

A sample of 4 cosmetic brands in the USA shows their revenues (in billions of dollars) 5
years ago and their revenues (in billions of dollars) to date. At α = 0.05, can it be
concluded that the average revenue for the brands is different today than it was 5 years
ago? Use a 95% confidence level.

Brands            1      2      3      4
5 years ago       2.5    3.9    1.45   6.2
Till date today   12.1   10.9   11.6   9.8

Assumptions:

• The data are continuous (not discrete).
• The data follow the normal probability distribution.
• The two samples are independent.
• Both samples are simple random samples from their respective populations. Each
individual in the population has an equal probability of being selected in the sample.

Solution:

Ho: μ1 = μ2
Ha: μ1 ≠ μ2

Conclusion:

Since the p-value 0.004 (significant value) is less than alpha (0.05), we will reject the null
hypothesis and accept the alternate hypothesis.

Question 10: Confidence Interval for T Distribution of 2 Means

We would like to estimate the difference between the mean amounts of money spent on books
in a year by pre-engineering and medical students. Data were collected from 10 randomly
selected pre-engineering students (X̄ = $138.7, S = $56.3) and 6 randomly selected medical
students (X̄ = $168.3, S = $57.06). Compute the 95% confidence interval for the difference
between the two means.

Medical (N = 6)   Pre-engineering (n = 10)
100               99
110               120
200               130
240               80
210               98
150               100
                  110
                  200
                  240
                  210

X̄1 = 168.3        X̄2 = 138.7
S1 = 57.06        S2 = 56.3

Assumptions:

• The data are continuous (not discrete).
• The data follow the normal probability distribution.
• The variances of the two populations are equal.
• The two samples are independent.
• There is no relationship between the individuals in one sample as compared to the
other (as there is in the paired t-test).
• Both samples are simple random samples from their respective populations. Each
individual in the population has an equal probability of being selected in the sample.

Solution:

Conclusion:

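With the pooled standard deviation (56.58) and t = 2.145 for 14 degrees of freedom, the 95%
confidence interval for the difference between the two means (medical minus
pre-engineering) is 29.63 ± 62.67, that is, -33.03 to 92.30. Because this interval contains
0, there is no significant difference between the mean amounts spent on books by
pre-engineering and medical students. The interval 108.44 to 228.22 is the 95% confidence
interval for the medical students' mean alone (168.3 ± 2.571 × 57.06/√6).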
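
For reference, a pooled-variance interval for the difference between two means can be sketched as follows. This is an illustration under the equal-variance assumption listed above, written with Python's standard library only; the critical value 2.145 is an assumed t-table value (two-sided 95%, 6 + 10 - 2 = 14 degrees of freedom) and the function name is hypothetical.

```python
from math import sqrt
from statistics import mean, variance

medical = [100, 110, 200, 240, 210, 150]
pre_engineering = [99, 120, 130, 80, 98, 100, 110, 200, 240, 210]

def pooled_interval(a, b, t_crit):
    """Interval for mean(a) - mean(b) using the pooled variance."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    std_error = sqrt(pooled_var * (1 / na + 1 / nb))
    diff = mean(a) - mean(b)
    return diff - t_crit * std_error, diff + t_crit * std_error

# Assumed table value: t(0.975, df = 14) = 2.145
lower, upper = pooled_interval(medical, pre_engineering, 2.145)
print(round(lower, 2), round(upper, 2))
```

An interval for a difference that contains 0 indicates no significant difference at the chosen level; an interval that excludes 0 indicates a significant one.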

Question 11: Hypothesis Testing & Confidence Interval for Two Population
Proportions

In 2019, 40% of men in Iraq were married and 60.3% of women were married. Random
samples of 300 men and 300 women found that 128 men and 129 women were married (not
necessarily to each other). At the 0.05 level of significance, can it be concluded that the
proportion of men who were married is greater than the proportion of women who were
married? Use a 95% confidence interval.

Assumptions:

• Each response falls into one of two categories (married or not).
• The samples are large enough for the normal approximation.
• The two samples are independent.
• Both samples are simple random samples from their respective populations. Each
individual in the population has an equal probability of being selected in the sample.

Solution:

Ho: p1 = p2
Ha: p1 ≠ p2

Conclusion:

There is no significant difference between the population proportions of married men and
married women. We fail to reject the null hypothesis since the p-value (0.327) is greater
than alpha (0.05).

Question 12: Hypothesis Testing & Confidence Interval for Two Population
Variances

Windsor Park in the UK, opened a year ago, wants to see whether the mean and standard
deviation of the daily attendance of employees working in the park differ between the
summer and winter months. Data were collected on 5 random days from each season and
are shown below. At alpha 0.05, can we conclude that the two standard deviations differ?

Number of days   1     2     3     4     5
Winter season    110   200   190   420   230
Summer season    150   350   180   320   140

Assumptions:

• The data are continuous (not discrete).
• The data follow the normal probability distribution.
• The two samples are independent.
• Both samples are simple random samples from their respective populations. Each
individual in the population has an equal probability of being selected in the sample.

Solution:

Ho: σ1² = σ2²
Ha: σ1² ≠ σ2²

Conclusion:

There is no significant difference between the population variances of the 2 seasons.
We fail to reject the null hypothesis since the p-value (0.758) is greater than alpha
(0.05).
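
One common way to compare two variances is the F ratio of the sample variances. The sketch below assumes only Python's standard library and is not necessarily the procedure the statistical package used to obtain the p-value above; the critical value 9.60 is an assumed table value for F(0.025; 4, 4).

```python
from statistics import variance

winter = [110, 200, 190, 420, 230]
summer = [150, 350, 180, 320, 140]

def f_ratio(a, b):
    """Larger sample variance over the smaller, with its degrees of freedom."""
    var_a, var_b = variance(a), variance(b)
    if var_a >= var_b:
        return var_a / var_b, len(a) - 1, len(b) - 1
    return var_b / var_a, len(b) - 1, len(a) - 1

f_stat, df_num, df_den = f_ratio(winter, summer)
f_crit = 9.60    # assumed F-table value for a two-sided test at alpha 0.05
reject_h0 = f_stat > f_crit
print(round(f_stat, 3), df_num, df_den, reject_h0)
```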

Correlations and Regression
Chap# 10
Question 13: Regression and Standard Error

The number of faculty and the number of students are shown for a random selection of a
new Inspiration college. Is there sufficient evidence to conclude a significant relationship
between the two variables?

Calculate the regression, correlation and standard error for the data shown below.

Data A B C D E F G
Faculty 99 119 112 119 140 180 250
Students 1353 1290 1091 1213 1384 1283 2074

Assumptions:

• The two variables can be measured at the continuous level (i.e., they are
either interval or ratio variables).
• There are no significant outliers.
• There is independence of observations.

Solution:

Ho: ρ = 0
Ha: ρ ≠ 0

Conclusion:

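For the data above, r = 0.857 and the test statistic t = r√((n-2)/(1-r²)) = 3.72 exceeds
the critical value 2.571 (alpha 0.05, 5 degrees of freedom), so the p-value is less than
alpha (0.05). We reject the null hypothesis: there is a significant relationship between
the number of faculty and the number of students.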
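
The quantities asked for in this question (regression line, correlation and standard error of the estimate) can be computed directly from the sums of squares. The following is a minimal sketch assuming only Python's standard library; the function and variable names are illustrative, and the t statistic is the usual test of ρ = 0 with n - 2 degrees of freedom.

```python
from math import sqrt

faculty = [99, 119, 112, 119, 140, 180, 250]
students = [1353, 1290, 1091, 1213, 1384, 1283, 2074]

def simple_regression(x, y):
    """Least-squares line y = intercept + slope * x, with r, s_est and t."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    std_error = sqrt((syy - slope * sxy) / (n - 2))   # standard error of estimate
    t_stat = r * sqrt((n - 2) / (1 - r ** 2))         # test of rho = 0
    return slope, intercept, r, std_error, t_stat

slope, intercept, r, std_error, t_stat = simple_regression(faculty, students)
print(round(slope, 3), round(intercept, 2), round(r, 3),
      round(std_error, 2), round(t_stat, 2))
```

The t statistic is compared with the t-table value for n - 2 = 5 degrees of freedom (2.571 for a two-sided test at alpha 0.05).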

Other Chi-Square Tests
Chap# 11
Question 14: Goodness of Fit

A market analyst wished to survey if there was a preference of consumers for any particular
flavor of beverage being sold by a company that commissioned the research. A sample of
100 people provided the data listed below:

Mint- 32
Mango- 28
Orange- 16
Lime- 14
Grape- 10

At alpha 0.10, can it be concluded that there is no preference?

Assumptions:

• The data are obtained from a random sample.
• The expected frequency of each category must be at least 5.

Solution:

Ho: consumers show no preference for any particular flavor of beverage
Ha: consumers show a preference for at least one flavor of beverage

Beverages   Observation   % Preferences
Mint        32            30
Mango       28            26
Orange      16            23
Lime        14            15
Grape       10            6
Total       100           100

Conclusion:

We fail to reject the null hypothesis since the p-value is greater than alpha (0.10).
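
The goodness-of-fit statistic is χ² = Σ(O - E)²/E. The sketch below, assuming only Python's standard library, takes the "% Preferences" column of the table as the expected counts for 100 respondents; 7.779 is an assumed chi-square table value for alpha 0.10 with 4 degrees of freedom.

```python
observed = [32, 28, 16, 14, 10]   # Mint, Mango, Orange, Lime, Grape
expected = [30, 26, 23, 15, 6]    # "% Preferences" column, n = 100

# Chi-square goodness-of-fit statistic: sum of (O - E)^2 / E
chi2_stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

chi2_crit = 7.779                 # assumed table value: alpha 0.10, df = 4
reject_h0 = chi2_stat > chi2_crit
print(round(chi2_stat, 3), df, reject_h0)
```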

Question 15: Test of Independence

Is the choice of car colors dependent upon gender? Recent records from a large automobile
dealer indicated the following preferences. At the 0.05 level of significance, is there a
relationship?

          Red   Blue   Yellow   White   Black
Males     22    26     2        20      30
Females   10    10     8        30      22

Assumptions:

• The two variables can be measured at an ordinal or nominal level (i.e., categorical data).
• The two variables should consist of two or more categorical, independent groups.

Solution:

Ho: the choice of car colors is independent of gender


Ha: the choice of car colors is dependent on gender

Conclusion:

We will reject the null hypothesis since the p-value is less than alpha (0.05).
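
The test of independence compares each observed count with the expected count (row total × column total / grand total). A minimal sketch with Python's standard library only follows; the function name is illustrative, and 9.488 is an assumed chi-square table value for alpha 0.05 with 4 degrees of freedom.

```python
car_colors = [
    [22, 26, 2, 20, 30],   # males: red, blue, yellow, white, black
    [10, 10, 8, 30, 22],   # females
]

def chi_square_independence(rows):
    """Chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(c) for c in zip(*rows)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(rows):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / grand_total
            stat += (obs - exp) ** 2 / exp
    df = (len(rows) - 1) * (len(col_totals) - 1)
    return stat, df

chi2_stat, df = chi_square_independence(car_colors)
chi2_crit = 9.488          # assumed table value: alpha 0.05, df = 4
reject_h0 = chi2_stat > chi2_crit
print(round(chi2_stat, 2), df, reject_h0)
```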

Analysis of Variances
Chap# 12

Question 16: One Way ANOVA

A researcher wants to analyze whether there is a significant difference in the number of
customers who visit outlets during traditional occasions. The data are shown below. With
α = 0.05, can it be concluded that there is a significant difference in the average number
of customers visiting the outlets?

Khadi Men's Wear   Khadi Kids      Khadi Women's Wear
10                 1               7
1                  12              14
1                  1               32
0                  9               19
11                 1               10
1                  11              11
Mean = 4           Mean = 5.8      Mean = 15.5
Variance = 25.6    Variance = 29   Variance = 81.9

Assumptions:

• The dependent variable is measured at the interval or ratio level (i.e., it is
continuous).
• The independent variable consists of two or more categorical, independent groups.
• There is independence of observations, which means that there is no relationship
between the observations in each group or between the groups themselves.
• There are no significant outliers.
• The dependent variable is approximately normally distributed for each category of
the independent variable.
• There is homogeneity of variances.

Solution:

Ho: μ1 = μ2= μ3
Ha: at least one of the means is different from the others

Conclusion:

We will reject the null hypothesis since the p-value is less than alpha (0.05).
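
The F statistic for a one-way ANOVA is the between-group mean square divided by the within-group mean square. The following sketch assumes only Python's standard library; the function name is illustrative and 3.68 is an assumed F-table value for alpha 0.05 with (2, 15) degrees of freedom.

```python
from statistics import mean

mens = [10, 1, 1, 0, 11, 1]
kids = [1, 12, 1, 9, 1, 11]
womens = [7, 14, 32, 19, 10, 11]

def one_way_anova(samples):
    """Return (F, df_between, df_within) for a list of samples."""
    all_values = [v for s in samples for v in s]
    grand_mean = mean(all_values)
    ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    ss_within = sum((v - mean(s)) ** 2 for s in samples for v in s)
    df_between = len(samples) - 1
    df_within = len(all_values) - len(samples)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

f_stat, df_between, df_within = one_way_anova([mens, kids, womens])
f_crit = 3.68              # assumed table value: alpha 0.05, df = (2, 15)
reject_h0 = f_stat > f_crit
print(round(f_stat, 3), df_between, df_within, reject_h0)
```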

Question 17: Two Way ANOVA

A finance researcher wishes to test the effects of two different business strategies and two
different budgeting modules on the ROI of an IT project in the IT industry. Three quarters
of the year are randomly assigned to each group. Analyze the data shown here, using a
two-way ANOVA with α = 0.05.

                      Budgeting Modules
Business Strategies   I     II
A                     62    65
                      64    68
                      66    72
B                     58    83
                      62    85
                      52    91

Assumptions:

• The populations from which the samples were obtained must be normally or
approximately normally distributed.
• The samples must be independent.
• The variances of the populations must be equal.
• The groups must have the same sample size.

Solution:

The null hypotheses for each of the sets are given below.

• The population means of the first factor are equal.
• The population means of the second factor are equal.
• There is no interaction between the two factors.

Conclusion:

We will reject all three null hypotheses since each p-value is less than alpha (0.05).
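
For a balanced design with replication, the three F statistics (each factor and the interaction) can be sketched as follows. This is an illustration in plain Python, not the output of the statistical package; the names are hypothetical and 5.32 is an assumed F-table value for alpha 0.05 with (1, 8) degrees of freedom.

```python
from statistics import mean

# roi[strategy][module] holds the three observations of each cell
roi = {
    "A": {"I": [62, 64, 66], "II": [65, 68, 72]},
    "B": {"I": [58, 62, 52], "II": [83, 85, 91]},
}

def two_way_anova(cells):
    """F statistics for a balanced two-factor design with replication."""
    rows = list(cells)
    cols = list(cells[rows[0]])
    reps = len(cells[rows[0]][cols[0]])
    grand = mean([v for a in rows for b in cols for v in cells[a][b]])
    row_mean = {a: mean([v for b in cols for v in cells[a][b]]) for a in rows}
    col_mean = {b: mean([v for a in rows for v in cells[a][b]]) for b in cols}
    ss_rows = reps * len(cols) * sum((row_mean[a] - grand) ** 2 for a in rows)
    ss_cols = reps * len(rows) * sum((col_mean[b] - grand) ** 2 for b in cols)
    ss_inter = reps * sum(
        (mean(cells[a][b]) - row_mean[a] - col_mean[b] + grand) ** 2
        for a in rows for b in cols)
    ss_within = sum((v - mean(cells[a][b])) ** 2
                    for a in rows for b in cols for v in cells[a][b])
    df_rows, df_cols = len(rows) - 1, len(cols) - 1
    ms_within = ss_within / (len(rows) * len(cols) * (reps - 1))
    return {
        "strategy": ss_rows / df_rows / ms_within,
        "module": ss_cols / df_cols / ms_within,
        "interaction": ss_inter / (df_rows * df_cols) / ms_within,
    }

f_values = two_way_anova(roi)
f_crit = 5.32              # assumed table value: alpha 0.05, df = (1, 8)
for effect, f_stat in f_values.items():
    print(effect, round(f_stat, 2), f_stat > f_crit)
```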
