Assignment 1 (Stat-702)
Testing of Hypothesis and Confidence Interval for Population Mean

Q # 2: A random sample of 30 wheat farms from Faisalabad district showed a mean wheat
production of 50 kg per acre. Can we conclude that the mean production of wheat in
Faisalabad district is greater than 45 kg per acre, if the standard deviation is 3 kg? Use the 5% level
of significance.
Solution:
One-Sample T
Descriptive Statistics
 N   Mean  StDev  SE Mean  5% Lower Bound for μ
30  50.00  3.000    0.548                50.931
μ: mean of Sample
Test
Null hypothesis         H₀: μ = 45
Alternative hypothesis  H₁: μ > 45
T-Value  P-Value
   9.13    0.000
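
As a cross-check of the T-Value above, the following minimal sketch recomputes the standard error and the test statistic from the summary figures given in the question. The function name `one_sample_t` is an illustrative choice, not part of Minitab.

```python
import math

def one_sample_t(sample_mean, mu0, sd, n):
    """Return (standard error, t statistic) for a one-sample t-test."""
    se = sd / math.sqrt(n)              # standard error of the mean
    return se, (sample_mean - mu0) / se

# Summary figures from Q # 2: n = 30, mean = 50, sd = 3, H0: mu = 45
se, t = one_sample_t(sample_mean=50, mu0=45, sd=3, n=30)
print(f"SE Mean = {se:.3f}, T-Value = {t:.2f}")
```

Since the reported P-Value (0.000) is below the 5% level, H₀ is rejected in favour of μ > 45.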

Q # 3: A researcher wishes to estimate the average amount of money that a student from
university spends for food per day. A random sample of 36 students is selected and the
sample mean is found to be Rs 45 with standard deviation of Rs.3. Estimate 90 %
confidence limits for the average amount of money that the students from the university
spend on food per day.
Solution:
One-Sample T
Descriptive Statistics
 N    Mean  StDev  SE Mean  90% CI for μ
36  45.000  3.000    0.500  (44.155, 45.845)
μ: mean of Sample


Q # 4:- The following data represent the daily milk production of a random sample of 10
cows from a particular breed: 12, 15, 11, 13, 16, 19, 15, 16, 18, 15. Construct a 95% C.I. for the
average milk production of all the cows of that particular breed and test the hypothesis that
the average yield is less than 15.

Solution:
One-Sample T: x
Descriptive Statistics
 N    Mean  StDev  SE Mean  95% Upper Bound for μ
10  15.000  2.494    0.789                 16.446
μ: mean of x
Test
Null hypothesis         H₀: μ = 15
Alternative hypothesis  H₁: μ < 15
T-Value  P-Value
   0.00    0.500
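
The descriptive statistics above can be reproduced directly from the ten observations in the question. This is a small sketch using only the Python standard library; the variable names are illustrative.

```python
import math
import statistics

milk = [12, 15, 11, 13, 16, 19, 15, 16, 18, 15]  # data from Q # 4

n = len(milk)
mean = statistics.mean(milk)
sd = statistics.stdev(milk)      # sample standard deviation (n - 1 divisor)
se = sd / math.sqrt(n)           # standard error of the mean
t = (mean - 15) / se             # test statistic for H0: mu = 15

print(n, mean, round(sd, 3), round(se, 3), round(t, 2))
```

Because the sample mean equals the hypothesized value, the statistic is exactly zero, which agrees with the reported P-Value of 0.500.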

Q # 5: For each of the following, a random sample of size n is taken from a
normal distribution with mean µ and variance σ². The sample mean is X̄. Test
the hypothesis stated at the level of significance indicated.

Sample    n     X̄      s      H₀          H₁          Level of Significance
1        30    15.2    3.0    µ = 15.8    µ ≠ 15.8    5%
2        10    27.0    1.2    µ ≤ 26.3    µ > 26.3    5%

Sample    n     X̄      Σ(xᵢ - X̄)²    H₀          H₁          Level of Significance
3        65    100     842.4         µ ≤ 99.2    µ > 99.2    5%
4        80    85.3    2508.8        µ ≥ 86.2    µ < 86.2    10%
5       100    6.85    36            µ = 7.0     µ ≠ 7.0     1%

Solution:
Part 1


One-Sample T
Descriptive Statistics
 N    Mean  StDev  SE Mean  5% CI for μ
30  15.200  3.000    0.548  (15.165, 15.235)
μ: mean of Sample
Test
Null hypothesis         H₀: μ = 15.8
Alternative hypothesis  H₁: μ ≠ 15.8
T-Value  P-Value
  -1.10    0.282
Part 2
One-Sample T
Descriptive Statistics
 N   Mean  StDev  SE Mean  5% Lower Bound for μ
10  27.00  1.200    0.379                27.696
μ: mean of Sample
Test
Null hypothesis         H₀: μ = 26.3
Alternative hypothesis  H₁: μ > 26.3
T-Value  P-Value
   1.84    0.049
Part 3
One-Sample T
Descriptive Statistics
 N  Mean  StDev  SE Mean  5% Lower Bound for μ
65   188    842      104                   362
μ: mean of Sample
Test
Null hypothesis         H₀: μ = 99.2
Alternative hypothesis  H₁: μ > 99.2
T-Value  P-Value
   0.85    0.199
Part 4
One-Sample T
Descriptive Statistics
 N  Mean  StDev  SE Mean  10% Upper Bound for μ
80    85   2509      280                   -277
μ: mean of Sample
Test
Null hypothesis         H₀: μ = 86.2
Alternative hypothesis  H₁: μ < 86.2
T-Value  P-Value
  -0.00    0.499
Part 5
One-Sample T
Descriptive Statistics
  N  Mean  StDev  SE Mean  1% CI for μ
100  6.85  36.00     3.60  (6.80, 6.90)
μ: mean of Sample
Test
Null hypothesis         H₀: μ = 7
Alternative hypothesis  H₁: μ ≠ 7
T-Value  P-Value
  -0.04    0.967
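
For samples 3 to 5 the question appears to give the sum of squared deviations rather than a standard deviation, whereas the outputs for Parts 3 to 5 show those same figures (842, 2509, 36) in the StDev column. Assuming that reading of the table is right, the sketch below first converts each sum of squares to a sample standard deviation, s = sqrt(SS / (n - 1)), and then computes the t statistic. The helper name `t_from_sum_of_squares` is hypothetical.

```python
import math

def t_from_sum_of_squares(n, xbar, ss, mu0):
    """Return (s, t) when the sum of squared deviations ss is given."""
    s = math.sqrt(ss / (n - 1))      # sample standard deviation
    se = s / math.sqrt(n)            # standard error of the mean
    return s, (xbar - mu0) / se

# (n, sample mean, sum of squared deviations, hypothesized mean) from Q # 5
samples = {
    3: (65, 100, 842.4, 99.2),
    4: (80, 85.3, 2508.8, 86.2),
    5: (100, 6.85, 36, 7.0),
}
results = {k: t_from_sum_of_squares(*v) for k, v in samples.items()}
for k, (s, t) in results.items():
    print(f"Sample {k}: s = {s:.3f}, t = {t:.2f}")
```

Under this assumption the statistics come out near 1.78, -1.43 and -2.49, which differ noticeably from the T-Values in the outputs above, so the inputs to Parts 3 to 5 are worth rechecking.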

Q # 6. Two independent samples of 100 machinists and 100 carpenters are taken to estimate
the difference between the weekly wages of the two categories of workers. The relevant data
are given below.

Sample       Mean wages   Population Variance
Machinists          345                   196
Carpenters          340                   204

Determine the 95% and the 99% confidence limits for the true difference between the
average wages for machinists and carpenters.

Two-Sample T-Test and CI
Method
μ₁: mean of Sample 1
µ₂: mean of Sample 2
Difference: μ₁ - µ₂
Equal variances are assumed for this analysis.
Descriptive Statistics
Sample      N   Mean  StDev  SE Mean
Sample 1  100  345.0   14.0      1.4
Sample 2  100  340.0   14.3      1.4
Estimation for Difference
Difference  Pooled StDev  99% CI for Difference
      5.00         14.14  (-0.20, 10.20)
Test
Null hypothesis         H₀: μ₁ - µ₂ = 0
Alternative hypothesis  H₁: μ₁ - µ₂ ≠ 0
T-Value  P-Value
   2.50     0.01
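
Because the question supplies population variances, one alternative to the t-based output above is a normal (z) interval for the difference, which also yields the 95% limits that the output does not show. The sketch below is written under that assumption; `z_interval` is an illustrative helper, not a Minitab command.

```python
import math
from statistics import NormalDist

def z_interval(diff, var1, n1, var2, n2, level):
    """Confidence interval for mu1 - mu2 with known population variances."""
    se = math.sqrt(var1 / n1 + var2 / n2)          # standard error of the difference
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # two-sided critical value
    return diff - z * se, diff + z * se

# Machinists: mean 345, variance 196; carpenters: mean 340, variance 204
ci95 = z_interval(345 - 340, 196, 100, 204, 100, 0.95)
ci99 = z_interval(345 - 340, 196, 100, 204, 100, 0.99)
print("95%:", tuple(round(v, 2) for v in ci95))
print("99%:", tuple(round(v, 2) for v in ci99))
```

The standard error is exactly 2 here, so the z statistic is 5 / 2 = 2.5, in line with the T-Value reported above. The z-based 99% limits are slightly narrower than the t-based ones.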


Q # 7. The following data represent the number of hours of relief provided by 4 different
brands (A, B, C, D) of headache tablets administered to 24 subjects experiencing fevers of 38°C
or more.
Solution:
One-way ANOVA: A, B, C, D
Method
Null hypothesis All means are equal
Alternative hypothesis Not all means are equal
Significance level α = 0.05
Equal variances were assumed for the analysis.
Factor Information
Factor Levels Values
Factor 4 A, B, C, D
Analysis of Variance
Source DF Adj SS Adj MS F-Value P-Value
Factor 3 13.928 4.6428 23.75 0.000
Error 20 3.910 0.1955
Total 23 17.838
Model Summary
S R-sq R-sq(adj) R-sq(pred)
0.442154 78.08% 74.79% 68.44%
Means
Factor N Mean StDev 95% CI
A 6 3.850 0.394 (3.473, 4.227)
B 6 5.083 0.621 (4.707, 5.460)
C 6 3.567 0.301 (3.190, 3.943)
D 6 5.333 0.388 (4.957, 5.710)
Pooled StDev = 0.442154
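
The F-Value, R-sq and pooled standard deviation in the ANOVA output follow arithmetically from the sums of squares and degrees of freedom. The short sketch below recomputes them from the table entries; the variable names are illustrative.

```python
import math

# Entries copied from the Analysis of Variance table above
ss_factor, df_factor = 13.928, 3
ss_error, df_error = 3.910, 20
ss_total = 17.838

ms_factor = ss_factor / df_factor     # mean square between brands
ms_error = ss_error / df_error        # mean square within brands
f_value = ms_factor / ms_error        # F statistic
r_sq = ss_factor / ss_total           # proportion of variation explained
pooled_sd = math.sqrt(ms_error)       # pooled standard deviation S

print(round(f_value, 2), f"{100 * r_sq:.2f}%", round(pooled_sd, 3))
```

With a P-Value of 0.000, below α = 0.05, the hypothesis that all four brand means are equal is rejected.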

Assignment No 2


Q. 1:- Take the first 50 odd numbers (for girls) or the first 50 even numbers (for boys), add
the last two digits of your registration number to each, and name this variable X. Calculate all
the descriptive statistics of the X variable. (Show the data as well as the results.)
odd no x
1 57
3 59
5 61
7 63
9 65
11 67
13 69
15 71
17 73
19 75
21 77
23 79
25 81
27 83
29 85
31 87
33 89
35 91
37 93
39 95
41 97
43 99
45 101
47 103
49 105
51 107
53 109
55 111
57 113
59 115
61 117
63 119
65 121
67 123
69 125
71 127
73 129
75 131
77 133
79 135
81 137
83 139
85 141
87 143
89 145
91 147
93 149
95 151


97 153
99 155
Solution:
Descriptive Statistics: x
Statistics
Variable   N  N*    Mean  SE Mean  StDev  Minimum     Q1  Median      Q3  Maximum
x         50   0  106.00     4.12  29.15    57.00  80.50  106.00  131.50   155.00
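
The table can be reproduced without Minitab. In the sketch below the offset of 56 is an assumption inferred from the data listing (each x exceeds its odd number by 56). `statistics.quantiles` with its default "exclusive" method happens to reproduce the quartiles shown above.

```python
import math
import statistics

offset = 56  # assumption: inferred from x - odd no in the data listing
x = [odd + offset for odd in range(1, 100, 2)]  # first 50 odd numbers plus offset

n = len(x)
mean = statistics.mean(x)
sd = statistics.stdev(x)                        # sample standard deviation
se = sd / math.sqrt(n)                          # standard error of the mean
q1, median, q3 = statistics.quantiles(x, n=4)   # default "exclusive" method

print(n, mean, round(se, 2), round(sd, 2), min(x), q1, median, q3, max(x))
```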

Q. 2:- Generate a random sample of 50 observations from a Normal population using your
computer and name it Y, with
Mean = 56
SD = 6
(a) Show the histogram, box plot, dot plot, stem-and-leaf display, 5-number summary, mean,
variance, standard deviation, standard error and other possible descriptive statistics.
(b) Take the log of the Y variable and calculate all the above quantities for log Y.

Solution:
y y log
55.0394 1.74067
62.5018 1.79589
54.4514 1.73601
40.6429 1.60898
59.3960 1.77376
45.7666 1.66055
56.7525 1.75398
61.9404 1.79197
64.0976 1.80684
56.0339 1.74845
41.7293 1.62044
55.3017 1.74274
51.5425 1.71217
52.1177 1.71699
47.6767 1.67831
57.3193 1.75830
56.4509 1.75167
57.1236 1.75682
59.7637 1.77644
65.6931 1.81752
56.8746 1.75492
46.7253 1.66955
56.3147 1.75062
47.9736 1.68100
57.5440 1.76000
54.1692 1.73375
56.4612 1.75175
50.9375 1.70704


52.8178 1.72278
54.6995 1.73798
56.4138 1.75138
53.6204 1.72933
65.0174 1.81303
43.0942 1.63442
57.9027 1.76270
57.4801 1.75952
55.5304 1.74453
52.7995 1.72263
46.1610 1.66428
58.9324 1.77035
45.1734 1.65488
58.3802 1.76627
51.2394 1.70960
64.9608 1.81265
53.3196 1.72689
35.4775 1.54995
53.4248 1.72774
51.1432 1.70879
49.7835 1.69709
49.5308 1.69488

[Figure: Boxplot of y]

[Figure: Dotplot of y]

[Figure: Histogram of y (x-axis: y, y-axis: Frequency)]

Stem-and-Leaf Display: y
Stem-and-leaf of y   N = 50
  1   3  5
  1   3
  1   3
  3   4  01
  4   4  3
  6   4  55
 10   4  6677
 12   4  99
 16   5  0111
 22   5  222333
 (6)  5  444555
 22   5  666666677777
 10   5  8899
  6   6  1
  5   6  2
  4   6  4455
Leaf Unit = 1

Descriptive Statistics: y
Statistics
Variable   N  N*    Mean  SE Mean  StDev  Minimum      Q1  Median      Q3  Maximum
y         50   0  53.905    0.911  6.443   35.478  50.649  54.869  57.496   65.693

[Figure: Boxplot of y log]

[Figure: Histogram of y log (x-axis: y log, y-axis: Frequency)]

[Figure: Dotplot of y log]

Stem-and-Leaf Display: y log
Stem-and-leaf of y log   N = 50
  1  15  4
  1  15
  1  15
  2  16  0
  4  16  23
  5  16  5
  9  16  6667
 12  16  899
 17  17  00011
 25  17  22222333
 25  17  44445555555555
 11  17  66777
  6  17  99
  4  18  0111
Leaf Unit = 0.01

Descriptive Statistics: y log
Statistics
Variable   N  N*    Mean  SE Mean   StDev  Minimum      Q1  Median      Q3  Maximum
y log     50   0  1.7284  0.00774  0.0547   1.5500  1.7045  1.7393  1.7596   1.8175

Q. 3:- Draw a scatter plot of Y (in Q.2) and X (in Q.1). Fit a simple linear regression
equation and provide the fitted equation and inference about β0 and β1 using the t-test. Also provide
the ANOVA and R². Store the fits Ŷ in your datasheet and verify that Σ(Yᵢ - Ŷᵢ) = 0 by using
the Calculator in Minitab. Also store the residuals and verify that the sum of the stored residuals
is zero. Also take the descriptive statistics and make the histogram of the residuals and
comment on its distribution.

Solution:

[Figure: Scatterplot of odd no vs x]

[Figure: Scatterplot of y vs y log]

[Figure: Scatterplot of y vs x]

Regression Analysis: y versus x
Analysis of Variance
Source      DF   Adj SS  Adj MS  F-Value  P-Value
Regression   1    49.96   49.96     1.21    0.277
  x          1    49.96   49.96     1.21    0.277
Error       48  1984.36   41.34
Total       49  2034.32
Model Summary
      S   R-sq  R-sq(adj)  R-sq(pred)
6.42969  2.46%      0.42%       0.00%
Coefficients
Term         Coef  SE Coef  T-Value  P-Value   VIF
Constant    57.58     3.46    16.64    0.000
x         -0.0346   0.0315    -1.10    0.277  1.00
Regression Equation
y = 57.58 - 0.0346 x
Fits and Diagnostics for Unusual Observations
Obs      y    Fit   Resid  Std Resid
  4  40.64  55.39  -14.75      -2.37  R
 11  41.73  54.91  -13.18      -2.09  R
 46  35.48  52.48  -17.01      -2.73  R
R  Large residual

Descriptive Statistics: RESI
Statistics
Variable   N  N*    Mean  SE Mean  StDev  Minimum      Q1  Median     Q3  Maximum
RESI      50   0  -0.000    0.900  6.364  -17.007  -2.706   0.887  3.721   12.337

Descriptive Statistics: FITS, RESI
Statistics
Variable   N  N*    Mean  SE Mean  StDev  Minimum      Q1  Median      Q3  Maximum
FITS      50   0  53.905    0.143  1.010   52.208  53.022  53.905  54.788   55.602
RESI      50   0  -0.000    0.900  6.364  -17.007  -2.706   0.887   3.721   12.337

[Figure: Histogram of RESI (x-axis: RESI, y-axis: Frequency)]

[Figure: Histogram of FITS (x-axis: FITS, y-axis: Frequency)]

ASSIGNMENT NO 3: Regression and Correlation


Q1. It is well known that some form of advertising for a particular product will be
associated with and have an effect on its sales.
Company                         A   B   C   D   E   F   G   H   I   J
Sales in (000) Rs              25  35  29  24  37  11  18  26  16  29
Advertising cost in (000) Rs    8  12  11   5  14   3   6   8   4   9

[Figure: Scatterplot of sales y vs adv x]

Regression Analysis: sales y versus adv x
Analysis of Variance
Source          DF   Adj SS   Adj MS  F-Value  P-Value
Regression       1  543.112  543.112    71.36    0.000
  adv x          1  543.112  543.112    71.36    0.000
Error            8   60.888    7.611
  Lack-of-Fit    7   60.388    8.627    17.25    0.183
  Pure Error     1    0.500    0.500
Total            9  604.000
Model Summary
      S    R-sq  R-sq(adj)  R-sq(pred)
2.75880  89.92%     88.66%      84.03%
Coefficients
Term       Coef  SE Coef  T-Value  P-Value   VIF
Constant   7.69     2.23     3.45    0.009
adv x     2.164    0.256     8.45    0.000  1.00
Regression Equation
sales y = 7.69 + 2.164 adv x
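
The fitted equation can be checked by hand with the usual least-squares formulas, b1 = Sxy / Sxx and b0 = ȳ - b1·x̄. The sketch below applies them to the ten companies in the table and also confirms that the residuals sum to zero. Variable names are illustrative.

```python
adv = [8, 12, 11, 5, 14, 3, 6, 8, 4, 9]              # advertising cost, (000) Rs
sales = [25, 35, 29, 24, 37, 11, 18, 26, 16, 29]     # sales, (000) Rs

n = len(adv)
x_bar = sum(adv) / n
y_bar = sum(sales) / n
sxx = sum((x - x_bar) ** 2 for x in adv)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(adv, sales))
syy = sum((y - y_bar) ** 2 for y in sales)

b1 = sxy / sxx                        # slope
b0 = y_bar - b1 * x_bar               # intercept
fits = [b0 + b1 * x for x in adv]
residuals = [y - f for y, f in zip(sales, fits)]
r_sq = b1 * sxy / syy                 # regression SS divided by total SS

print(f"sales = {b0:.2f} + {b1:.3f} adv, R-sq = {100 * r_sq:.2f}%")
print(f"sum of residuals = {sum(residuals):.10f}")
```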

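Sales for an advertising cost of Rs 800

Advertising cost is recorded in thousands of rupees, so Rs 800 corresponds to adv x = 0.8:
Y = 7.69 + 2.164(0.8) = 9.42
We estimate sales of about Rs 9.42 thousand when the advertising cost is Rs 800. This value of x lies below the observed range (3 to 14), so the estimate is an extrapolation.
Test
Null hypothesis         H₀: μ = 0
Alternative hypothesis  H₁: μ ≠ 0
Sample   T-Value  P-Value
sales y     9.65    0.000
adv x       7.05    0.000
Conclusion
We reject H₀ because the P-Value (0.000) is less than any conventional level of significance.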

Q2. The management of a chain of fast food restaurants wants to investigate the
relationship between the daily sales volume of a company and the number of competitors.
Competitors   1   1   2   3   3   6   5   7
Sales (00$)  36  33  31  29  26  22  20  19

1. Draw a scatter plot
2. Develop a linear regression model
3. Predict sales from the regression model if
a. Competitors are 9
b. Competitors are 4
4. Test the significance of the regression model
5. Test the hypothesis β1 = 0

[Figure: Scatterplot of y sales vs comp x]

Regression Analysis: y sales versus comp x
The regression equation is
y sales = 36.24 - 2.639 comp x
Model Summary
      S    R-sq  R-sq(adj)
2.05368  90.83%     89.30%
Analysis of Variance
Source      DF       SS       MS      F      P
Regression   1  250.694  250.694  59.44  0.000
Error        6   25.306    4.218
Total        7  276.000
One-Sample T: y sales, comp x
Descriptive Statistics
Sample   N   Mean  StDev  SE Mean  95% CI for μ
y sales  8  27.00   6.28     2.22  (21.75, 32.25)
comp x   8  3.500  2.268    0.802  (1.604, 5.396)
μ: mean of y sales, comp x
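
Part 3 of the question asks for predictions at 9 and 4 competitors. A minimal sketch, refitting the line from the eight data pairs and evaluating it at those two values, is given below. The helper `predict` is a hypothetical name.

```python
competitors = [1, 1, 2, 3, 3, 6, 5, 7]
sales = [36, 33, 31, 29, 26, 22, 20, 19]   # in (00$)

n = len(competitors)
x_bar = sum(competitors) / n
y_bar = sum(sales) / n
sxx = sum((x - x_bar) ** 2 for x in competitors)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(competitors, sales))

b1 = sxy / sxx                 # slope
b0 = y_bar - b1 * x_bar        # intercept

def predict(x):
    """Predicted sales (00$) for x competitors from the fitted line."""
    return b0 + b1 * x

pred_9 = predict(9)   # outside the observed range 1 to 7: extrapolation
pred_4 = predict(4)
print(f"b0 = {b0:.2f}, b1 = {b1:.3f}, 9 -> {pred_9:.2f}, 4 -> {pred_4:.2f}")
```

The prediction for 9 competitors lies beyond the largest observed value (7), so it should be read with more caution than the one for 4 competitors.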


Q3: Following is the data of insects per hill (X) and the corresponding yield (Y) of 9
different varieties of rice.
Y 45 47 55 43 38 40 35 48 54
X 24 23 16 25 29 24 28 22 16

1. Scatter plot
2. Regression model
3. Analysis test

[Figure: Scatterplot of y vs x]

Regression Analysis: y versus x
Analysis of Variance
Source          DF  Adj SS   Adj MS  F-Value  P-Value
Regression       1  338.37  338.367    70.43    0.000
  x              1  338.37  338.367    70.43    0.000
Error            7   33.63    4.805
  Lack-of-Fit    5   20.63    4.127     0.63    0.705
  Pure Error     2   13.00    6.500
Total            8  372.00
Model Summary
      S    R-sq  R-sq(adj)  R-sq(pred)
2.19195  90.96%     89.67%      86.17%
Coefficients
Term        Coef  SE Coef  T-Value  P-Value   VIF
Constant   77.84     3.98    19.55    0.000
x         -1.428    0.170    -8.39    0.000  1.00
Regression Equation
y = 77.84 - 1.428 x
One-Sample T: y, x
Descriptive Statistics
Sample  N   Mean  StDev  SE Mean  95% CI for μ
y       9  45.00   6.82     2.27  (39.76, 50.24)
x       9  23.00   4.56     1.52  (19.50, 26.50)
μ: mean of y, x

Regression Equation
Yield = 77.84 - 1.428 Insects Per Hill

Ŷ = 77.84 - 1.428(36) = 26.43
The slope is b1 = -1.428: each additional insect per hill is associated with a decrease of about 1.428 units in yield. Note that x = 36 lies outside the observed range (16 to 29), so this prediction is an extrapolation.

Model Summary
      S    R-sq  R-sq(adj)  R-sq(pred)
2.19195  90.96%     89.67%      86.17%
About 91% of the variation in yield is explained by the variable X; the remaining 9% is due to
other factors.

After the analysis of the data, we can say that an increase in the number of insects per
hill reduces the yield.

Q4: The following data represent the final marks in the statistics course for a
random sample of 12 students of the MS Finance class, along with their class test marks
and the number of classes missed during the winter semester 2017.
1. Fit an appropriate regression model
2. Test the significance of the regression model
3. Test the significance of the partial regression coefficients

 Y  X1  X2
85  65   2
70  50  10
76  55   6
90  65   2
85  55   6
87  70   3
94  65   2
98  70   5
81  55   4
92  70   3
76  45   1
74  55   4

Regression Analysis: final marks y versus test scores x1, ... es missed x2
Analysis of Variance
Source               DF  Adj SS  Adj MS  F-Value  P-Value
Regression            2  629.82  314.91    13.48    0.002
  test scores x1      1  467.69  467.69    20.03    0.002
  classes missed x2   1   26.04   26.04     1.11    0.319
Error                 9  210.18   23.35
  Lack-of-Fit         4   92.01   23.00     0.97    0.496
  Pure Error          5  118.17   23.63
Total                11  840.00
Model Summary
      S    R-sq  R-sq(adj)  R-sq(pred)
4.83253  74.98%     69.42%      54.79%
Coefficients
Term                 Coef  SE Coef  T-Value  P-Value   VIF
Constant             38.2     11.9     3.20    0.011
test scores x1      0.807    0.180     4.48    0.002  1.12
classes missed x2  -0.654    0.619    -1.06    0.319  1.12
Regression Equation
final marks y = 38.2 + 0.807 test scores x1 - 0.654 classes missed x2
Two-Sample T-Test and CI: test scores x1, classes missed x2
Method
μ₁: mean of test scores x1
µ₂: mean of classes missed x2
Difference: μ₁ - µ₂
Equal variances are not assumed for this analysis.
Descriptive Statistics
Sample N Mean StDev SE Mean
test scores x1 12 60.00 8.53 2.5
classes missed x2 12 4.00 2.49 0.72

Estimation for Difference
Difference  95% CI for Difference
     56.00  (50.41, 61.59)
Test
Null hypothesis H₀: μ₁ - µ₂ = 0
Alternative hypothesis H₁: μ₁ - µ₂ ≠ 0
T-Value DF P-Value
21.84 12 0.000

Q 6: The time X in years that an employee has spent at a company and the employee's
hourly pay Y for 8 employees are listed in the table below. Calculate and interpret the
correlation coefficient r. Also test the significance of the
population correlation coefficient at the 5% level of significance.
 Y   X
25   6
20   4
21   5
35  10
38  15
28  12
34  11
39   9

Correlations
Pearson correlation  0.784
P-value              0.021
One-Sample T: y, x
Descriptive Statistics
Sample  N   Mean  StDev  SE Mean  95% CI for μ
y       8  30.00   7.52     2.66  (23.71, 36.29)
x       8   9.00   3.78     1.34  (5.84, 12.16)
μ: mean of y, x
Test
Null hypothesis         H₀: μ = 5
Alternative hypothesis  H₁: μ ≠ 5
Sample  T-Value  P-Value
y          9.40    0.000
x          2.99    0.020
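
The question asks for a test of the population correlation coefficient, which is usually done with t = r·sqrt(n - 2) / sqrt(1 - r²) on n - 2 degrees of freedom rather than with one-sample t-tests of the means. The sketch below computes r and that statistic from the eight data pairs; it is an illustration and not part of the Minitab session.

```python
import math

pay = [25, 20, 21, 35, 38, 28, 34, 39]     # hourly pay Y
years = [6, 4, 5, 10, 15, 12, 11, 9]       # years at the company X

n = len(pay)
x_bar = sum(years) / n
y_bar = sum(pay) / n
sxx = sum((x - x_bar) ** 2 for x in years)
syy = sum((y - y_bar) ** 2 for y in pay)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(years, pay))

r = sxy / math.sqrt(sxx * syy)                      # Pearson correlation
t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)    # test statistic, df = n - 2

print(f"r = {r:.3f}, t = {t:.2f} on {n - 2} df")
```

The reported P-value of 0.021 is below 0.05, so the correlation is significant at the 5% level.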


ASSIGNMENT NO 4
Q # 1: It is hypothesized that the average diameter of the leaves of a tree is 20.4 mm. To
check this supposition we selected a random sample of 16 leaves and found that X̄ = 22 mm.
The population standard deviation (σ) is known and is 2 mm. Also
construct a 95% confidence interval and verify your decision.
One-Sample Z
Descriptive Statistics
 N    Mean  SE Mean  95% CI for μ
16  22.000    0.500  (21.020, 22.980)
μ: mean of Sample
Known standard deviation = 2
Test
Null hypothesis         H₀: μ = 20.4
Alternative hypothesis  H₁: μ ≠ 20.4
Z-Value  P-Value
   3.20    0.001
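
The Z-Value, P-Value and confidence interval can be reproduced with the standard library's `statistics.NormalDist`. The following is a small sketch using the figures from the question.

```python
import math
from statistics import NormalDist

n, x_bar, sigma, mu0 = 16, 22, 2, 20.4   # figures from Q # 1

se = sigma / math.sqrt(n)                       # standard error of the mean
z = (x_bar - mu0) / se                          # test statistic
p_value = 2 * (1 - NormalDist().cdf(abs(z)))    # two-sided P-Value
z_crit = NormalDist().inv_cdf(0.975)            # critical value for 95%
ci = (x_bar - z_crit * se, x_bar + z_crit * se)

print(f"Z = {z:.2f}, P = {p_value:.3f}, CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

The hypothesized value 20.4 falls outside the interval, which agrees with rejecting H₀ at the 5% level.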

Q # 2: A researcher wishes to estimate the average amount of money that a student from
university spends for food per day. A random sample of 36 students is selected and the
sample mean is found to be Rs 45 with standard deviation of Rs.3. Estimate 90 % confidence
limits for the average amount of money that the students from the university spend on food
per day.
One-Sample T
Descriptive Statistics
 N    Mean  StDev  SE Mean  90% CI for μ
36  45.000  3.000    0.500  (44.155, 45.845)
μ: mean of Sample
Q # 3:- A cattle rancher has changed the type of feed he uses to fatten his cattle for sale. The
feed company claims that the new feed will increase the mean weight gain in his cattle by at
least 100 pounds per steer. Assuming the weight gain of cattle is normally distributed, test
the hypothesis of the feed company at α = 0.05. Previously, the mean
weight gain per steer has been 800 pounds. A random sample of 30 yields a mean
weight of x̄ = 935 pounds with a standard deviation of 85 pounds.

One-Sample T
Descriptive Statistics
N Mean StDev SE Mean 95% CI for μ
30 935.0 85.0 15.5 (903.3, 966.7)
μ: mean of Sam
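
The output above gives only a confidence interval, while the question asks for a test. One possible reading, assumed here, is that the company's claim corresponds to a mean of at least 800 + 100 = 900 pounds, so the reference value in the test is 900. The sketch computes the resulting t statistic on 29 degrees of freedom.

```python
import math

n, x_bar, s = 30, 935, 85     # sample figures from Q # 3
mu0 = 800 + 100               # assumption: previous mean plus the claimed gain

se = s / math.sqrt(n)         # standard error of the mean
t = (x_bar - mu0) / se        # t statistic, df = n - 1

print(f"SE Mean = {se:.1f}, t = {t:.2f} on {n - 1} df")
```

The sample mean lies above 900, so under this reading the data give no evidence against the company's claim. The 95% interval reported above (903.3, 966.7) also lies entirely above 900.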


Q # 4. A test in statistics was given to 50 girls and 75 boys. The girls made an average grade of
76 with a standard deviation of 6, while the boys made an average grade of 82 with a standard
deviation of 8. Find a 95 % confidence interval for the difference U1 – U2, where U1 is the mean
score of all boys and U2 is the mean score of all girls who might take this test.
Two-Sample T-Test and CI
Method
μ₁: mean of Sample 1
µ₂: mean of Sample 2
Difference: μ₁ - µ₂
Equal variances are not assumed for this analysis.
Descriptive Statistics
Sample N Mean StDev SE Mean
Sample 1 50 76.00 6.00 0.85
Sample 2 75 82.00 8.00 0.92
Estimation for Difference
95% CI for
Difference Difference
-6.00 (-8.48, -3.52)
Test
Null hypothesis H₀: μ₁ - µ₂ = 0
Alternative hypothesis H₁: μ₁ - µ₂ ≠ 0
T-Value DF P-Value
-4.78 121 0.000
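
When equal variances are not assumed, the test statistic uses the unpooled standard error and the degrees of freedom come from the Welch-Satterthwaite approximation. The sketch below reproduces the T-Value and DF above from the summary figures. The function name `welch` is illustrative.

```python
import math

def welch(mean1, s1, n1, mean2, s2, n2):
    """Return (t, df) for a two-sample t-test without assuming equal variances."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    se = math.sqrt(v1 + v2)                   # unpooled standard error
    t = (mean1 - mean2) / se
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Sample 1: girls (n = 50, mean 76, sd 6); Sample 2: boys (n = 75, mean 82, sd 8)
t, df = welch(76, 6, 50, 82, 8, 75)
print(f"t = {t:.2f}, df = {math.floor(df)}")
```

The approximate degrees of freedom are truncated to a whole number here, which matches the DF of 121 shown in the output.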

Q # 5. Two independent samples of 100 machinists and 100 carpenters are taken to estimate
the difference between the weekly wages of the two categories of workers. The relevant data
are given below.

Sample       Mean wages   Population Variance
Machinists          345                   196
Carpenters          340                   204

Determine the 95% and the 99% confidence limits for the true difference between the
average wages for machinists and carpenters.

Two-Sample T-Test and CI
Method
μ₁: mean of Sample 1
µ₂: mean of Sample 2
Difference: μ₁ - µ₂
Equal variances are not assumed for this analysis.
Descriptive Statistics
Sample      N  Mean  StDev  SE Mean
Sample 1  100   345    196       20
Sample 2  100   340    204       20
Estimation for Difference
Difference  95% CI for Difference
       5.0  (-50.8, 60.8)
Test
Null hypothesis         H₀: μ₁ - µ₂ = 0
Alternative hypothesis  H₁: μ₁ - µ₂ ≠ 0
T-Value   DF  P-Value
   0.18  197    0.860

Q # 6. A random sample of 200 villages was taken from Faisalabad District and average
population per village was found to be 498 with a standard deviation of 50. Another sample
of 200 villages from the same district gave an average population 510 per village with a
standard deviation of 40. Is the difference between the averages of the two samples
statistically significant?
Two-Sample T-Test and CI
Method
μ₁: mean of Sample 1
µ₂: mean of Sample 2
Difference: μ₁ - µ₂
Equal variances are not assumed for this analysis.
Descriptive Statistics
Sample N Mean StDev SE Mean
Sample 1 200 498.0 50.0 3.5
Sample 2 200 510.0 40.0 2.8
Estimation for Difference
95% CI for
Difference Difference
-12.00 (-20.90, -3.10)
Test
Null hypothesis H₀: μ₁ - µ₂ = 0
Alternative hypothesis H₁: μ₁ - µ₂ ≠ 0


T-Value DF P-Value
-2.65 379 0.008

Q # 7. An examination was given to two classes of 40 and 50 students respectively. In the
first class, the mean grade was 74 with a standard deviation of 8, while in the second class the
mean grade was 78 with a standard deviation of 7. Is there a significant difference between
the mean grades at the 1% level?
Two-Sample T-Test and CI
Method
μ₁: mean of Sample 1
µ₂: mean of Sample 2
Difference: μ₁ - µ₂
Equal variances are not assumed for this analysis.
Descriptive Statistics
Sample N Mean StDev SE Mean
Sample 1 40 74.00 8.00 1.3
Sample 2 50 78.00 7.00 0.99
Estimation for Difference
99% CI for
Difference Difference
-4.00 (-8.24, 0.24)
Test
Null hypothesis H₀: μ₁ - µ₂ = 0
Alternative hypothesis H₁: μ₁ - µ₂ ≠ 0
T-Value DF P-Value
-2.49 78 0.015

Q #8:- Following are the protein contents measured in two types of species.
Species 1 (X1)  0.72  1.12  0.81  0.89  0.72  0.81  1.01  0.75  0.83
Species 2 (X2)  1.21  0.93  0.80  1.12  1.22  0.94  0.87

Construct a 95% confidence interval for the difference between means (assume population
variances are equal).
Descriptive Statistics: Species 01, Species )2
Statistics
Variable     N  N*    Mean  SE Mean   StDev  Variance  Median
Species 01   9   0  0.8511   0.0453  0.1358    0.0184  0.8100
Species )2   7   2  1.0129   0.0638  0.1689    0.0285  0.9400

[Figure: Histogram of Species 01 (y-axis: Frequency)]

[Figure: Histogram of Species 2 (y-axis: Frequency)]

Q # 9:- A random sample of 20 plants from Variety I showed a mean height of 63 cm with a
standard deviation of 6 cm, while another random sample of 25 plants from Variety II showed
a mean height of 60 cm with a standard deviation of 2 cm. Construct a 90% confidence interval
for the difference between the two variety means. (Assume population variances are unequal.)
Two-Sample T-Test and CI
Method
μ₁: mean of Sample 1
µ₂: mean of Sample 2
Difference: μ₁ - µ₂
Equal variances are not assumed for this analysis.
Descriptive Statistics
Sample N Mean StDev SE Mean
Sample 1 20 63.00 6.00 1.3
Sample 2 25 60.00 2.00 0.40
Estimation for Difference
90% CI for
Difference Difference
3.00 (0.60, 5.40)
Test
Null hypothesis H₀: μ₁ - µ₂ = 0
Alternative hypothesis H₁: μ₁ - µ₂ ≠ 0
T-Value DF P-Value
2.14 22 0.043

Q # 10:- An experiment was performed with seven hop plants. One half of each plant was
pollinated and the other half was not pollinated. The yield of seed of each hop plant is given
below.
Plant No:           1     2     3     4     5     6     7
Pollinated:      0.78  0.76  0.43  0.92  0.86  0.59  0.68
Non-Pollinated:  0.21  0.12  0.32  0.29  0.30  0.20  0.14

Construct a 90% confidence interval for the difference between the mean seed yield of the
pollinated and non-pollinated halves.
Descriptive Statistics: Pollinated, Non-Polinated
Statistics
Variable        N  N*    Mean  SE Mean   StDev  Variance  Median
Pollinated      7   0  0.7171   0.0631  0.1670    0.0279  0.7600
Non-Pollinated  7   0  0.2257   0.0301  0.0796    0.0063  0.2100
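
Because both halves come from the same plant, the observations are paired, and a paired interval based on the within-plant differences is one natural way to answer the question. The sketch below is written under that assumption. The critical value 1.943 is the tabulated t value for a 90% two-sided interval with 6 degrees of freedom.

```python
import math
import statistics

pollinated = [0.78, 0.76, 0.43, 0.92, 0.86, 0.59, 0.68]
non_pollinated = [0.21, 0.12, 0.32, 0.29, 0.30, 0.20, 0.14]

d = [p - q for p, q in zip(pollinated, non_pollinated)]   # within-plant differences
n = len(d)
mean_d = statistics.mean(d)
se_d = statistics.stdev(d) / math.sqrt(n)                 # standard error of mean difference

t_crit = 1.943   # t table value for 90% two-sided, df = 6
lo, hi = mean_d - t_crit * se_d, mean_d + t_crit * se_d

print(f"mean difference = {mean_d:.4f}, 90% CI = ({lo:.3f}, {hi:.3f})")
```

The mean difference equals the gap between the two means in the table above (0.7171 - 0.2257), and the interval lies well above zero.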

[Figure: Histogram of Pollinated (y-axis: Frequency)]

[Figure: Histogram of Non-Polinated (y-axis: Frequency)]

Q # 11:- A random sample of size n1 = 10, selected from a normal population, has a mean
X̄1 = 20 and a standard deviation s1 = 5. A second random sample of size n2 = 12,
selected from a different normal population, has a mean X̄2 = 24 and a standard
deviation s2 = 6. If U1 = 22 and U2 = 19 and σ1² and σ2² are known but approximately
equal, test whether there is any reason to doubt that U1 - U2 = 3.
Two-Sample T-Test and CI
Method
μ₁: mean of Sample 1
µ₂: mean of Sample 2
Difference: μ₁ - µ₂
Equal variances are not assumed for this analysis.
Descriptive Statistics
Sample N Mean StDev SE Mean
Sample 1 10 20.00 5.00 1.6
Sample 2 12 24.00 6.00 1.7
Estimation for Difference
95% CI for
Difference Difference
-4.00 (-8.91, 0.91)
Test
Null hypothesis H₀: μ₁ - µ₂ = 3
Alternative hypothesis H₁: μ₁ - µ₂ ≠ 3
T-Value DF P-Value
-2.98 19 0.008

Q # 12:- Following are the heights of plants of two varieties of wheat, and the interest is to
test whether the two populations have the same height, assuming σ1² = σ2².
V-1  87 91 89 88 89 91 87 92 93 98 95 97 96 100 101
V-2  82 84 85 83 90 89 94 90 96 91 100 92 97 105 10
Descriptive Statistics: V-1, V-2
Statistics
Variable  N  N*    Mean  SE Mean  StDev  Variance  Median
V-1       9   0  89.667    0.726  2.179     4.750  89.000
V-2       9   0   88.11     1.65   4.94     24.36   89.00

[Figure: Histogram of V-1 (y-axis: Frequency)]

[Figure: Histogram of V-2 (y-axis: Frequency)]

Q # 13:- Ten recruits were put through a physical training programme by the army. Their
weights were recorded before and after the training as: 127 and 135, 126 and 200, 162 and
160, 170 and 182, 143 and 147, 205 and 200, 168 and 172, 175 and 186, 197 and 193, 136
and 141. Using α = 0.05, should we conclude that the training programme affects the
average weight of young recruits.
Descriptive Statistics: Before, After
Statistics
Variable N N* Mean SE Mean StDev Variance Median
Before 9 0 163.67 9.28 27.84 775.00 168.00
After 9 0 175.00 7.78 23.35 545.25 182.00

[Figure: Histogram of Before (y-axis: Frequency)]

[Figure: Histogram of After (y-axis: Frequency)]

Q # 14:- The time required by 10 persons to perform a task, in seconds, before and
after receiving a mild stimulant is given in the accompanying table.
Before  34 45 31 43 40 41 33 29 41 37
After   29 42 32 29 36 42 26 28 38 33
Test the hypothesis that there is no difference between the mean times in the before and after
populations. As an alternative, assume that the after population will have a lower mean.
Use the 5% level of significance.
Descriptive Statistics: Before 1, After 2
Statistics
Variable N N* Mean SE Mean StDev Variance Median
Before 1 9 0 37.44 1.92 5.75 33.03 40.00
After 2 9 0 33.56 2.04 6.13 37.53 32.00
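
The descriptive output above reports N = 9 for each column although the question lists 10 persons, and it does not include the test itself. Assuming the design is paired (the same person is timed before and after), the sketch below computes the paired t statistic from all ten pairs in the question.

```python
import math
import statistics

before = [34, 45, 31, 43, 40, 41, 33, 29, 41, 37]
after = [29, 42, 32, 29, 36, 42, 26, 28, 38, 33]

d = [b - a for b, a in zip(before, after)]    # positive when the time decreased
n = len(d)
mean_d = statistics.mean(d)
se_d = statistics.stdev(d) / math.sqrt(n)     # standard error of mean difference
t = mean_d / se_d                             # H0: mean difference = 0, df = n - 1

print(f"mean difference = {mean_d}, t = {t:.2f} on {n - 1} df")
```

The tabulated one-sided 5% critical value for 9 degrees of freedom is 1.833. The statistic exceeds it, so under these assumptions the data support a lower mean time after the stimulant.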

[Figure: Histogram of Before 1 (y-axis: Frequency)]

[Figure: Histogram of After 2 (y-axis: Frequency)]

Q # 15. The following data represent the number of hours of relief provided by 4 different
brands (A, B, C, D) of headache tablets administered to 24 subjects experiencing fevers of
38°C or more.
One-way ANOVA: A, B, C, D
Method
Null hypothesis All means are equal
Alternative hypothesis Not all means are equal
Significance level α = 0.05
Equal variances were assumed for the analysis.
Factor Information
Factor Levels Values
Factor 4 A, B, C, D
Analysis of Variance
Source DF Adj SS Adj MS F-Value P-Value
Factor 3 13.928 4.6428 23.75 0.000
Error 20 3.910 0.1955
Total 23 17.838
Model Summary
S R-sq R-sq(adj) R-sq(pred)
0.442154 78.08% 74.79% 68.44%
Means
Factor N Mean StDev 95% CI
A 6 3.850 0.394 (3.473, 4.227)
B 6 5.083 0.621 (4.707, 5.460)
C 6 3.567 0.301 (3.190, 3.943)
D 6 5.333 0.388 (4.957, 5.710)
Pooled StDev = 0.442154

[Figure: Interval Plot of A, B, ... (95% CI for the Mean; x-axis: A, B, C, D; y-axis: Data)]

The pooled standard deviation is used to calculate the intervals.

Q # 16. In an experiment to compare the effect of four drugs A, B, C and D on the
lymphocyte counts in mice, a randomized block design with four mice from each of five
litters was used, the litters being regarded as blocks. The lymphocyte counts (thousand per
mm³ of blood) are given in the table below:
One-way ANOVA: 1, 2, 3, 4, 5
Method
Null hypothesis All means are equal
Alternative hypothesis Not all means are equal
Significance level α = 0.05
Rows unused 1
Equal variances were assumed for the analysis.
Factor Information
Factor Levels Values
Factor 5 1, 2, 3, 4, 5
Analysis of Variance
Source DF Adj SS Adj MS F-Value P-Value
Factor 4 5.100 1.2751 7.35 0.002
Error 14 2.429 0.1735
Total 18 7.529


Model Summary
S R-sq R-sq(adj) R-sq(pred)
0.416548 67.74% 58.52% 41.98%
Means
Factor N Mean StDev 95% CI
1 3 6.833 0.231 (6.318, 7.349)
2 4 5.600 0.440 (5.153, 6.047)
3 4 6.175 0.525 (5.728, 6.622)
4 4 5.225 0.263 (4.778, 5.672)
5 4 5.925 0.486 (5.478, 6.372)
Pooled StDev = 0.416548
[Figure: Interval Plot of 1, 2, ... (95% CI for the Mean; x-axis: 1, 2, 3, 4, 5; y-axis: Data)]

The pooled standard deviation is used to calculate the intervals.
