Statistics Problems

Group :

Name: Roll no

Anuth Siddharth 127
Abir Banerjee 116
Ninad Tatke 175
Maulik Chandarana 168
Madhurima Chatterjee 159
Tulsi Zaveri 214
Daniel Fernandes 135
Deepika Singh 136

Confidence Interval (Single Population) 4 problems
Problem: 1
A ketchup manufacturer is in the process of deciding whether to promote a
new extra-spicy brand. The company's marketing-research department used
a national telephone survey of 6000 households and found that the extra
spicy ketchup would be purchased by 335 of them. A much more extensive
study made 2 years ago showed that 5 percent of the households would
purchase the brand then. At a 2 percent significance level, should the
company conclude that there is an increased interest in the extra-spicy flavor?
Solution:
n = 6000, p̂ = 335/6000 = 0.05583
H0: p = 0.05    H1: p > 0.05    α = 0.02
The upper limit of the acceptance region is z = 2.05, or
p̂ = pH0 + z*sqrt(pH0*qH0/n) = 0.05 + 2.05*sqrt(0.05*0.95/6000) = 0.05577
Because the observed z value = (p̂ - pH0)/sqrt(pH0*qH0/n)
= (0.05583 - 0.05)/sqrt(0.05*0.95/6000)
= 2.07
> 2.05 (or p̂ > 0.05577), we should reject H0. The
current interest is significantly greater than the interest of
2 years ago.
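
The test above can be checked with a few lines of Python. This is only a sketch using the standard library; the helper name proportion_z_test is our own and does not come from the original solution.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z_test(successes, n, p0, alpha):
    """Upper-tailed one-sample z-test for a proportion (illustrative helper)."""
    p_hat = successes / n
    se = sqrt(p0 * (1 - p0) / n)              # standard error under H0
    z = (p_hat - p0) / se                     # observed z value
    z_crit = NormalDist().inv_cdf(1 - alpha)  # upper-tail critical value
    return z, z_crit, z > z_crit

# Ketchup survey: 335 of 6000 households, H0: p = 0.05, alpha = 0.02
z, z_crit, reject = proportion_z_test(335, 6000, 0.05, 0.02)
print(round(z, 2), round(z_crit, 2), reject)
```

The function returns the observed z, the critical z, and whether H0 is rejected, so the three numbers can be compared with the hand calculation.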

Problem: 2
Steve Cutter sells Big Blade lawn mowers in his hardware store, and he is
interested in comparing the reliability of the mowers he sells with the reliability
of Big Blade mowers sold nationwide. Steve knows that only 15 percent of all
Big Blade mowers sold nationwide require repairs during the first year of
ownership. A sample of 120 of Steve's customers revealed that exactly 22 of
them required mower repairs in the first year of ownership. At the 0.02 level of
significance, is there evidence that Steve's Big Blade mowers differ in
reliability from those sold nationwide?
Solution:
n = 120
p̂ = 22/120 = 0.1833
H0: p = 0.15
H1: p ≠ 0.15
α = 0.02
The limits of the acceptance region are z = ±2.33, or
p̂ = pH0 ± z*sqrt(pH0*qH0/n) = 0.15 ± 2.33*sqrt(0.15*0.85/120)
= (0.0741, 0.2259)
Because the observed z value = (p̂ - pH0)/sqrt(pH0*qH0/n)
= (0.1833 - 0.15)/sqrt(0.15*0.85/120)
= 1.02
< 2.33, we do not reject H0. Steve's mowers are
not significantly different in reliability from those sold nationwide.

PROBLEM 3-
In a mobile phone manufacturing company, a random sample of 81 phones is
taken, producing a sample mean of 47 and a sample standard deviation of
5.89. Construct a 90% confidence interval assuming that the number of
camera phones among normal phones is normally distributed. Find the interval
width.

Answer-
Here:
s = 5.89
x̄ = 47
n = 81
Formula- x̄ ± Z*s/sqrt(n)
= 47 + 1.65*5.89/sqrt(81)
(where 1.65 is the table value of Z for an area of 0.45)
= 48.08 (upper limit)
Now,
= 47 - 1.65*5.89/sqrt(81)
= 45.92 (lower limit)
Interval width = 48.08 - 45.92 = 2.16
Answer- We are 90% confident that the mean number of camera phones among
normal phones lies between 45.92 and 48.08.

PROBLEM 4-
In a new food home delivery service business, there is an average loss of 12 dollars.
Suppose this resulted from a random sample of 25 households, where
the SD is $21. Compute a 98% confidence interval for this sample result. How
wide is the interval?
Answer-
Here:
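s = 21
x̄ = 12
n = 25
Formula- x̄ ± Z*s/sqrt(n)
= 12 + 2.33*21/sqrt(25)
= 21.79 (upper limit)
Now,
= 12 - 2.33*21/sqrt(25)
= 2.21 (lower limit)
Interval width = 2*2.33*21/sqrt(25) = 19.57
Answer- We are 98% confident that the mean loss lies between $2.21 and $21.79, an interval about $19.57 wide.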
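
The interval formula used in Problems 3 and 4 can be sketched in Python as follows. The function name mean_confidence_interval is an assumption made for illustration; the Z values are the table values quoted in the answers, not recomputed.

```python
from math import sqrt

def mean_confidence_interval(x_bar, s, n, z):
    """Return (lower, upper, width) for x_bar ± z * s / sqrt(n)."""
    margin = z * s / sqrt(n)
    return x_bar - margin, x_bar + margin, 2 * margin

# Z values taken from the text: 1.65 for 90%, 2.33 for 98%
lo3, hi3, width3 = mean_confidence_interval(47, 5.89, 81, 1.65)   # Problem 3
lo4, hi4, width4 = mean_confidence_interval(12, 21, 25, 2.33)     # Problem 4
print(round(lo3, 2), round(hi3, 2), round(width3, 2))
print(round(lo4, 2), round(hi4, 2), round(width4, 2))
```

Returning the width alongside the limits answers the "how wide is the interval" part of both problems directly.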

Hypothesis Testing (Single Population) 4
1. Hypothesis Testing (Single Population)

Problem 1
An insurance company is reviewing its current policy rates. When
originally setting the rates they believed that the average claim amount
was $1,800. They are concerned that the true mean is actually higher
than this, because they could potentially lose a lot of money. They
randomly select 40 claims, and calculate a sample mean of $1,950.
Assuming that the standard deviation of claims is $500, and setting α =
0.05, test to see if the insurance company should be concerned.
Solution
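n = 40
x̄ = 1950
σ = 500
α = 0.05
μ0 = 1800

H0: μ ≤ 1800
H1: μ > 1800

Critical value: z = 1.645 (one-tailed test, since H1 is directional)

z = (x̄ - μ0)/(σ/sqrt(n)) = (1950 - 1800)/(500/sqrt(40)) = 1.897

Answer
Reject H0, as 1.897 exceeds the one-tailed critical value of 1.645. At the 5%
level the sample gives significant evidence that the mean claim is higher than
$1,800, so the insurance company should be concerned about its current
rates. (Against the two-tailed critical value of 1.96, H0 would not be rejected.)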
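
As a cross-check on the arithmetic, the z statistic and its upper-tail probability can be computed with Python's standard library. This is a sketch; the helper name z_test_mean is our own.

```python
from math import sqrt
from statistics import NormalDist

def z_test_mean(x_bar, mu0, sigma, n):
    """Return the z statistic and the upper-tail p-value P(Z > z)."""
    z = (x_bar - mu0) / (sigma / sqrt(n))
    p_upper = 1 - NormalDist().cdf(z)
    return z, p_upper

# Insurance claims: sample mean 1950, hypothesized mean 1800, sigma 500, n 40
z, p_upper = z_test_mean(1950, 1800, 500, 40)
print(round(z, 3), round(p_upper, 4))
```

Comparing p_upper with α gives the one-tailed decision; doubling it gives the two-tailed p-value.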
Problem 2
A car manufacturer claimed that their car averaged at least 31 miles
per gallon of gasoline. A sample of 9 cars was selected and each car
was driven with one gallon of regular gasoline. The sample showed a
mean of 29.43 miles with a standard deviation of 3 miles. Use α = 0.05.
What do you conclude about the manufacturer's claim?
Solution
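n = 9
x̄ = 29.43
s = 3
α = 0.05
μ0 = 31

H0: μ ≥ 31
H1: μ < 31

Critical value: t = -1.860 (one-tailed, df = n - 1 = 8)

t = (x̄ - μ0)/(s/sqrt(n)) = (29.43 - 31)/(3/sqrt(9)) = -1.57

Answer
We cannot reject H0, since -1.57 > -1.860. There is insufficient evidence to doubt the
manufacturer's claim concerning the gas mileage.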
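
A small-sample t statistic of this kind can be sketched in Python. The standard library has no t-distribution table, so the critical value below is simply the table value quoted in the solution; t_statistic and T_CRIT are illustrative names.

```python
from math import sqrt

def t_statistic(x_bar, mu0, s, n):
    """Return t = (x_bar - mu0) / (s / sqrt(n)) and its degrees of freedom."""
    return (x_bar - mu0) / (s / sqrt(n)), n - 1

t, df = t_statistic(29.43, 31, 3, 9)
T_CRIT = -1.860   # lower-tail table value for df = 8, alpha = 0.05 (from the solution)
reject = t < T_CRIT
print(round(t, 2), df, reject)
```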

Problem 3
General Electric has developed a new bulb whose design specifications call
for a light output of 960 lumens compared to an earlier model that
produced only 750 lumens. The company's data indicate that the
standard deviation of light output for this type of bulb is 18.4 lumens.
From a sample of 20 new bulbs, the testing committee found an
average light output of 954 lumens per bulb. At a 0.05 significance
level, can General Electric conclude that its new bulb is producing the
specified 960 lumen output?
Solution
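σ = 18.4
n = 20
x̄ = 954
α = 0.05
μ0 = 960
H0: μ = 960
H1: μ < 960
Critical value: z = -1.65 (one-tailed)
z = (x̄ - μ0)/(σ/sqrt(n))
= (954 - 960)/(18.4/sqrt(20))
= -1.46
Answer
Do not reject H0, since -1.46 > -1.65. The new bulb is meeting specifications.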

Problem 4
BSNL provides telephone services in Coimbatore. According to the
company's records the average length of calls placed through the
company is 11.44 minutes. The company wants to check if the mean
length of the current calls is different from 11.44 minutes. A sample of
150 such calls placed through this company gave a mean length of
12.71 minutes with a standard deviation of 2.65 minutes. Can you
conclude that the mean length of all current calls is different from 11.44
minutes? Use α = 0.05.
Solution
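s = 2.65
n = 150
x̄ = 12.71
μ0 = 11.44
α = 0.05
H0: μ = 11.44
H1: μ ≠ 11.44
Critical values: z = ±1.96 (two-tailed test)
z = (x̄ - μ0)/(s/sqrt(n))
= (12.71 - 11.44)/(2.65/sqrt(150))
= 5.87
Answer
Reject H0, since 5.87 > 1.96. It is concluded that the mean length of current calls is different from
11.44 minutes.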

Confidence Interval (Two Populations) 9
Problem 1: Small Samples
Suppose that simple random samples of college freshmen are selected from
two universities - 15 students from school A and 20 students from school B.
On a standardized test, the sample from school A has an average score of
1000 with a standard deviation of 100. The sample from school B has an
average score of 950 with a standard deviation of 90.
What is the 90% confidence interval for the difference in test scores at the two
schools, assuming that test scores came from normal distributions in both
schools? (Hint: Since the sample sizes are small, use a t score as the critical
value.)
(A) 50 ± 1.70 (B) 50 ± 28.49 (C) 50 ± 32.74 (D) 50 ± 55.66 (E) None
of the above
Solution
The correct answer is (D). The approach that we used to solve this problem is
valid when the following conditions are met.
The sampling method must be simple random sampling. This
condition is satisfied; the problem statement says that we used simple
random sampling.
The samples must be independent. Since responses from one
sample did not affect responses from the other sample, the samples
are independent.
The sampling distribution should be approximately normally
distributed. The problem states that test scores in each population are
normally distributed, so the difference between test scores will also be
normally distributed.
Since the above requirements are satisfied, we can use the following four-
step approach to construct a confidence interval.
Identify a sample statistic. Since we are trying to estimate the
difference between population means, we choose the difference
between sample means as the sample statistic. Thus, x̄1 - x̄2 = 1000 -
950 = 50.
Select a confidence level. In this analysis, the confidence level is
defined for us in the problem. We are working with a 90% confidence
level.
Find the margin of error. The key steps are shown below.
Find standard deviation. Using the sample standard
deviations, we estimate the standard deviation of the difference
between sample means (SD).
SD = sqrt[ s1^2/n1 + s2^2/n2 ]
SD = sqrt[ (100)^2/15 + (90)^2/20 ] = sqrt(10,000/15 + 8100/20)
= sqrt(666.67 + 405) = 32.74
Find critical value. The critical value is a factor used to
compute the margin of error. Because the sample sizes are
small, we express the critical value as a t score rather than a z
score. To find the critical value, we take these steps.
Compute alpha (α): α = 1 - (confidence level / 100)
= 1 - 90/100 = 0.10
Find the critical probability (p*): p* = 1 - α/2 = 1 -
0.10/2 = 0.95
Find the degrees of freedom (DF):
DF = (s1^2/n1 + s2^2/n2)^2 / { [ (s1^2/n1)^2 / (n1 - 1) ] + [ (s2^2/n2)^2 / (n2 - 1) ] }
DF = (100^2/15 + 90^2/20)^2 / { [ (100^2/15)^2 / 14 ] + [ (90^2/20)^2 / 19 ] }
DF = (666.67 + 405)^2 / (31746.03 + 8632.89) = 1148469.44 / 40378.92 = 28.44
Rounding to the nearest whole number, we conclude that there are
28 degrees of freedom.
The critical value is the t score having 28 degrees
of freedom and a cumulative probability equal to 0.95.
From the t Distribution Calculator, we find that the critical
value is 1.7.

Compute margin of error (ME): ME = critical value *
standard deviation = 1.7 * 32.74 = 55.66

Specify the confidence interval. The range of the confidence interval is
defined by the sample statistic ± margin of error. And the uncertainty is
denoted by the confidence level.
Therefore, the 90% confidence interval is -5.66 to 105.66. That is, we are 90%
confident that the true difference in population means is in the range defined
by 50 ± 55.66.
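
The standard deviation and degrees-of-freedom steps above can be sketched in Python. The helper name welch_se_df is our own, and the critical value 1.70 is taken from the text rather than computed, since the standard library has no t-distribution quantile function.

```python
from math import sqrt

def welch_se_df(s1, n1, s2, n2):
    """Standard error of x1 - x2 and Welch-Satterthwaite degrees of freedom."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    se = sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return se, df

se, df = welch_se_df(100, 15, 90, 20)
T_CRIT = 1.70                 # t table value for 28 df, cumulative 0.95 (from the text)
margin = T_CRIT * se          # margin of error
diff = 1000 - 950             # difference between the sample means
print(round(se, 2), round(df, 2), round(diff - margin, 2), round(diff + margin, 2))
```

Because the code carries the unrounded standard error forward, its limits can differ from the hand calculation in the second decimal place.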

Problem 2: Large Samples
The local baseball team conducts a study to find the amount spent on
refreshments at the ball park. Over the course of the season they gather
simple random samples of 50 men and 100 women. For men, the average
expenditure was $20, with a standard deviation of $3. For women, it was $15,
with a standard deviation of $2.
What is the 99% confidence interval for the spending difference between men
and women? Assume that the two populations are independent and normally
distributed.
(A) $5 ± $0.47 (B) $5 ± $1.21 (C) $5 ± $2.58 (D) $5 ± $5.00 (E)
None of the above
Solution
The correct answer is (B). The approach that we used to solve this problem is
valid when the following conditions are met.
The sampling method must be simple random sampling. This condition
is satisfied; the problem statement says that we used simple random
sampling.
The samples must be independent. Again, the problem statement
satisfies this condition.
The sampling distribution should be approximately normally distributed.
The problem states that spending in each population is normally
distributed, so the difference between the sample means will also be normally
distributed.
Since the above requirements are satisfied, we can use the following four-
step approach to construct a confidence interval.
Identify a sample statistic. Since we are trying to estimate the
difference between population means, we choose the difference
between sample means as the sample statistic. Thus, x̄1 - x̄2 = $20 -
$15 = $5.
Select a confidence level. In this analysis, the confidence level is
defined for us in the problem. We are working with a 99% confidence
level.
Find the margin of error. The key steps are shown below.
Find standard deviation. Since we do not know the
standard deviation of the populations, we use the sample
standard deviations to estimate the standard deviation of the
difference between sample means (SD).
SD = sqrt[ s1^2/n1 + s2^2/n2 ]
SD = sqrt[ (3)^2/50 + (2)^2/100 ] = sqrt(9/50 + 4/100) =
sqrt(0.18 + 0.04) = 0.47
Find critical value. The critical value is a factor used to
compute the margin of error. Because the sample sizes are
large enough, we express the critical value as a z score. To find
the critical value, we take these steps.
Compute alpha (α): α = 1 - (confidence level / 100)
= 1 - 99/100 = 0.01
Find the critical probability (p*): p* = 1 - α/2 = 1 -
0.01/2 = 0.995
The critical value is the z score having a
cumulative probability equal to 0.995. From the Normal
Distribution Calculator, we find that the critical value is
2.58.

Compute margin of error (ME): ME = critical value *
standard deviation = 2.58 * 0.47 = 1.21

Specify the confidence interval. The range of the confidence interval is
defined by the sample statistic ± margin of error. And the uncertainty is
denoted by the confidence level.
Therefore, the 99% confidence interval is $3.79 to $6.21. That is, we are 99%
confident that men outspend women at the ballpark by about $5 ± $1.21.
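
For large samples the whole four-step procedure fits in one short function. The following Python sketch uses the standard library's NormalDist for the critical value; the name diff_means_ci is illustrative only.

```python
from math import sqrt
from statistics import NormalDist

def diff_means_ci(x1, s1, n1, x2, s2, n2, confidence):
    """z-based confidence interval for mu1 - mu2 (large samples)."""
    se = sqrt(s1**2 / n1 + s2**2 / n2)                   # SD of the difference
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # critical value
    margin = z * se
    diff = x1 - x2
    return diff - margin, diff + margin

# Men: mean 20, SD 3, n 50; women: mean 15, SD 2, n 100; 99% confidence
low, high = diff_means_ci(20, 3, 50, 15, 2, 100, 0.99)
print(round(low, 2), round(high, 2))
```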

Problem: 3
A large hotel chain is trying to decide whether to convert more of its rooms to
non-smoking rooms. In a random sample of 400 guests last year, 166 had
requested non-smoking rooms. This year 205 guests in a sample of 380
preferred the non-smoking rooms. Would you recommend that the hotel chain
convert more rooms to non-smoking? Support your recommendation by
testing the appropriate hypothesis at 0.01 level of significance.
Solution:
n1 = 400
p1 = 166/400 = 0.415
n2 = 380
p2 = 205/380 = 0.5395
H0: p1 = p2
H1: p1 < p2
α = 0.01
Pooled p = (n1 p1 + n2 p2)/(n1 + n2) = (400(0.415) + 380(0.5395))/(400 + 380)
= 0.4756
σ(p1-p2) = sqrt( p q (1/n1 + 1/n2) ) = sqrt( 0.4756(0.5244)(1/400 + 1/380) ) =
0.0358
The lower limit of the acceptance region is z = -2.33, or
p1 - p2 = 0 - 2.33(0.0358) = -0.0834
Because the observed z value = (p1 - p2)/σ(p1-p2) = (0.415 - 0.5395)/0.0358
= -3.48
< -2.33, we reject H0. The hotel chain should
convert more rooms to non-smoking because there was a
significant increase in the proportion of guests requesting
these rooms over the last year.
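
The pooled two-proportion test used here (and in the proportion problems that follow) can be sketched in Python. The helper name two_proportion_z is our own, and the critical value is the table value quoted in the solution.

```python
from math import sqrt

def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-sample z statistic for H0: p1 = p2."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Hotel guests requesting non-smoking rooms: 166 of 400 last year, 205 of 380 this year
z = two_proportion_z(166, 400, 205, 380)
Z_CRIT = -2.33    # lower-tail table value for alpha = 0.01 (from the solution)
reject = z < Z_CRIT
print(round(z, 2), reject)
```

Passing raw counts rather than rounded proportions avoids small rounding differences in the pooled estimate.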

Problem:4
Two different areas of a large eastern city are being considered as sites for
day care centres. Of 200 households surveyed in one section, the proportion
in which mothers worked full time was 0.52. In another section, 40 percent of
the 150 households surveyed had mothers working at full-time jobs. At the 0.04 level
of significance, is there a significant difference in the proportions of working
mothers in the two areas of the city?
Solution:
n1 = 200
p1 = 0.52
n2 = 150
p2 = 0.40
H0: p1 = p2
H1: p1 ≠ p2
α = 0.04
Pooled p = (n1 p1 + n2 p2)/(n1 + n2) = (200(0.52) + 150(0.40))/(200 + 150)
= 0.4686
σ(p1-p2) = sqrt( p q (1/n1 + 1/n2) ) = sqrt( 0.4686(0.5314)(1/200 + 1/150) ) =
0.0539
The limits of the acceptance region are z = ±2.05, or
p1 - p2 = 0 ± 2.05(0.0539) = ±0.1105
Because the observed z value = (p1 - p2)/σ(p1-p2) = (0.52 - 0.40)/0.0539
= 2.23
> 2.05, we reject H0. The proportions of working
mothers in the two areas differ significantly.

Problem: 5
ABC Airlines wants to find out whether to include more non-vegetarian food
items in its menu. In a random sample of 600 guests last year, 200 preferred the
inclusion of more non-vegetarian items. This year 300 guests in a sample of
500 preferred non-vegetarian items. Help the airline decide whether it should
include more non-vegetarian items. Support your recommendation by
testing the appropriate hypothesis at the 0.01 level of significance.
Solution:
n1 = 600
p1 = 200/600 = 0.333
n2 = 500
p2 = 300/500 = 0.600
H0: p1 = p2
H1: p1 < p2
α = 0.01
z = [ (p1 - p2) - 0 ] / sqrt( p(1 - p)(1/n1 + 1/n2) )
where p = (x1 + x2)/(n1 + n2) = (200 + 300)/(600 + 500) = 0.4545
= (0.333 - 0.600)/sqrt( 0.4545(1 - 0.4545)(1/600 + 1/500) )
= -0.267/0.0302 = -8.84
The lower limit of the acceptance region is z = -2.33
Because the observed z value = -8.84
< -2.33, we reject H0. The airline should include
more non-vegetarian items.

Problem:6
In a school, two different classes are being considered for the number one
ranking. Of the 200 students surveyed in one section, the proportion
who obtained full marks was 0.52. In another section, 40 percent of
the 150 students surveyed obtained full marks. At the 0.04 level of
significance, is there a significant difference in the proportions of students
getting full marks?
Solution:
n1 = 200
p1 = 0.52
n2 = 150
p2 = 0.40
H0: p1 = p2
H1: p1 ≠ p2
α = 0.04
Pooled p = (n1 p1 + n2 p2)/(n1 + n2) = (200(0.52) + 150(0.40))/(200 + 150)
= 0.4686
σ(p1-p2) = sqrt( p q (1/n1 + 1/n2) ) = sqrt( 0.4686(0.5314)(1/200 + 1/150) ) =
0.0539
The limits of the acceptance region are z = ±2.05, or
p1 - p2 = 0 ± 2.05(0.0539) = ±0.1105
Because the observed z value = (p1 - p2)/σ(p1-p2) = (0.52 - 0.40)/0.0539
= 2.23
> 2.05, we reject H0.
The proportions of students obtaining full marks differ significantly.

Hypothesis Testing (Two Populations) 9

Problem 1
A machine produced 20 defective articles in a batch of 400. After
overhauling, it produced 10 defectives in a batch of 300. Has the machine
improved?
n1 = 400, n2 = 300
p1 = 20/400 = 0.05, p2 = 10/300 = 0.0333

a) Statement of null and alternate hypothesis:
H0: P1 = P2
H1: P1 > P2

b) Level of significance:
Let α = 0.05 be the level of significance. According to H1, we use a one-tailed
test.

c) Test statistic and observed value:
z = (p1 - p2) / sqrt( P Q (1/n1 + 1/n2) )

P = (20 + 10)/700 = 3/70
Q = 1 - P = 67/70

z0 = (0.05 - 0.0333) / sqrt( (3/70)(67/70)(1/400 + 1/300) )
= 0.0167/0.01547
= 1.08

d) Expected value of the statistic:
z = (p1 - p2) / sqrt( P Q (1/n1 + 1/n2) ) has a standard normal distribution
for n1 and n2 ≥ 30. From the normal distribution table, ze = 1.645 for α = 0.05.

e) Decision and conclusion:
z0 = 1.08 and ze = 1.645. Since z0 < ze, we do not reject H0 and conclude that the
machine has not improved significantly due to overhauling.

Problem 2
A study was conducted to investigate the effectiveness of hypnotism in
reducing pain. Results for randomly selected subjects are shown in the table.
The "before" value is matched to an "after" value.
TABLE 1
Subject: A B C D E F G H
Before 6.6 6.5 9.0 10.3 11.3 8.1 6.3 11.6
After 6.8 2.4 7.4 8.5 8.1 6.1 3.4 2.0
Are the sensory measurements, on average, lower after hypnotism? Test at a
5% significance level.
Corresponding "before" and "after" values form matched pairs.
TABLE 2
After Data Before Data Difference
6.8 6.6 0.2
2.4 6.5 -4.1
7.4 9 -1.6
8.5 10.3 -1.8
8.1 11.3 -3.2
6.1 8.1 -2
3.4 6.3 -2.9
2 11.6 -9.6
The data for the test are the differences: {0.2, -4.1, -1.6, -1.8, -3.2, -2, -2.9, -9.6}
The sample mean and sample standard deviation of the differences are:
x̄d = -3.13 and sd = 2.91. Verify these values.
Let μd be the population mean for the differences. We use the subscript d to
denote "differences."
Random variable: X̄d = the average difference of the sensory measurements

H0: μd ≥ 0
There is no improvement. (μd is the population mean of the differences.)

Ha: μd < 0
There is improvement. The score should be lower after hypnotism so the
difference ought to be negative to indicate improvement.

Distribution for the test: The distribution is a Student's t with df = n - 1 = 8 - 1 = 7.
Use t7. (Notice that the test is for a single population mean.)
Calculate the p-value using the Student's t distribution: p-value = 0.0095
Compare α and the p-value: α = 0.05 and p-value = 0.0095, so α > p-value.
Make a decision: Since α > p-value, reject H0.
This means that μd < 0 and there is improvement.
Conclusion: At a 5% level of significance, from the sample data, there is
sufficient evidence to conclude that the sensory measurements, on average,
are lower after hypnotism. Hypnotism appears to be effective in reducing pain.
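
The text asks the reader to verify the mean and standard deviation of the differences. A short Python sketch using only the standard library does this and also computes the paired t statistic; the variable names are our own.

```python
from math import sqrt
from statistics import mean, stdev

before = [6.6, 6.5, 9.0, 10.3, 11.3, 8.1, 6.3, 11.6]
after = [6.8, 2.4, 7.4, 8.5, 8.1, 6.1, 3.4, 2.0]

diffs = [a - b for a, b in zip(after, before)]   # after minus before
d_bar = mean(diffs)                              # sample mean of the differences
s_d = stdev(diffs)                               # sample SD (n - 1 divisor)
t = d_bar / (s_d / sqrt(len(diffs)))             # paired t statistic, df = n - 1
print(d_bar, round(s_d, 2), round(t, 2))
```

The t statistic is then compared with the Student's t distribution with 7 degrees of freedom, as described above.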

PROBLEM 3-
A weight reducing cream manufacturing company wanted to see whether the
usage of the cream is beneficial or not. They are sceptical about the launch of
the same and hence they sampled monthly usage by 6 of its users before and
after using the same, where the significance level is .02, find out the change.
The results are as follows-
EMPLOYEE            1    2    3    4    5    6
MONTH BEFORE USE    219  205  226  198  209  216
MONTH AFTER USE     235  186  240  203  221  205

H0: μD = 0
H1: μD ≠ 0

D̄ = ΣDi/n
SD = sqrt[ Σ(Di - D̄)^2/(n - 1) ]
Formula to be used- t = (D̄ - μD)/(SD/sqrt(n))
Di = 16, -19, 14, 5, 12, -11
ΣDi = 17
D̄ = 17/6 = 2.83

SD = sqrt[ Σ(Di - D̄)^2/(n - 1) ]
= sqrt(1054.83/5)
= 14.52

Here, a t test is conducted:
t = (D̄ - μD)/(SD/sqrt(n))
= 2.83/(14.52/2.449)
= 0.48
So, H0 cannot be rejected. Hence we can say that there is no significant
change.

PROBLEM 4-
Two different telecom companies are trying to learn about the usage of their free calls at
night. The first company sampled 90 people and produced an average of 8.5
hours of relief and a sample SD of 1.8 hours. The second company sampled
80 people, producing an average of 7.9 hours of relief and a sample SD of 2.1
hours. At the .05 level of significance, does the 2nd company have less usage?
H0: μ1 - μ2 = 0
H1: μ1 - μ2 > 0
Here,
n1 = 90     n2 = 80
x̄1 = 8.5    x̄2 = 7.9
s1 = 1.8    s2 = 2.1

z = [ (x̄1 - x̄2) - (μ1 - μ2) ] / sqrt( s1^2/n1 + s2^2/n2 )
= (8.5 - 7.9)/sqrt( (1.8)^2/90 + (2.1)^2/80 )
= 0.6/0.3019
= 1.99

Hence H0 is rejected, since 1.99 exceeds the one-tailed critical value of 1.645.

Problem : 5
Within a school district, students were randomly assigned to one of two Math
teachers - Mrs. Smith and Mrs. Jones. After the assignment, Mrs. Smith had
30 students, and Mrs. Jones had 25 students.
At the end of the year, each class took the same standardized test. Mrs.
Smith's students had an average test score of 78, with a standard deviation of
10; and Mrs. Jones' students had an average test score of 85, with a standard
deviation of 15.
Test the hypothesis that Mrs. Smith and Mrs. Jones are equally effective
teachers. Use a 0.10 level of significance. (Assume that student performance
is approximately normal.)
Solution:
The solution to this problem takes four steps: (1) state the hypotheses, (2)
formulate an analysis plan, (3) analyze sample data, and (4) interpret results.
We work through those steps below:

Null hypothesis: μ1 - μ2 = 0
Alternative hypothesis: μ1 - μ2 ≠ 0
For this analysis, the significance level is 0.10. Using sample data, we
will conduct a two-sample t-test of the null hypothesis.
Using sample data, we compute the standard error (SE), degrees of
freedom (DF), and the t-score test statistic (t).
SE = sqrt[ (s1^2/n1) + (s2^2/n2) ]
SE = sqrt[ (10^2/30) + (15^2/25) ] = sqrt(3.33 + 9) = sqrt(12.33) = 3.51

DF = (s1^2/n1 + s2^2/n2)^2 / { [ (s1^2/n1)^2 / (n1 - 1) ] + [ (s2^2/n2)^2 / (n2 - 1) ] }
DF = (10^2/30 + 15^2/25)^2 / { [ (10^2/30)^2 / 29 ] + [ (15^2/25)^2 / 24 ] }
DF = (3.33 + 9)^2 / { [ (3.33)^2 / 29 ] + [ (9)^2 / 24 ] } = 152.03 / (0.382 +
3.375) = 152.03/3.757 = 40.47

t = [ (x̄1 - x̄2) - d ] / SE = [ (78 - 85) - 0 ] / 3.51 = -7/3.51 = -1.99
where s1 is the standard deviation of sample 1, s2 is the standard
deviation of sample 2, n1 is the size of sample 1, n2 is the size of
sample 2, x̄1 is the mean of sample 1, x̄2 is the mean of sample 2, d is
the hypothesized difference between the population means, and SE is
the standard error.
Since we have a two-tailed test, the P-value is the probability that a t-
score having 40 degrees of freedom is more extreme than -1.99; that
is, less than -1.99 or greater than 1.99.
We use the t Distribution Calculator to find P(t < -1.99) = 0.027, and P(t
> 1.99) = 0.027. Thus, the P-value = 0.027 + 0.027 = 0.054.
Since the P-value (0.054) is less than the significance level (0.10), we
reject the null hypothesis.
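
The solution reads the P-value from a t Distribution Calculator. As a rough stand-in, the two-tailed P-value can be approximated in Python by integrating the Student's t density numerically. This is a sketch under our own assumptions: Simpson's rule with 2000 steps, and the helper names t_pdf and two_tailed_p, are not part of the original solution.

```python
from math import gamma, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=2000):
    """P(|T| > |t|) via Simpson's rule on [0, |t|]; steps must be even."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = 2 * total * h / 3        # area between -|t| and |t|
    return 1 - central

se = sqrt(10**2 / 30 + 15**2 / 25)     # standard error
t = (78 - 85) / se                     # t-score test statistic
p = two_tailed_p(t, 40)                # 40 degrees of freedom, as in the text
print(round(se, 2), round(t, 2), round(p, 3))
```

The decision rule is then the usual comparison of p with the 0.10 significance level.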

Problem 6
In a restaurant there are 2 different departments, i.e. housekeeping and
maintenance. The housekeeping department had 45 waiters and the maintenance
department had 55 waiters.
At the end of the year, each department took the same standardized test to
measure its performance. The housekeeping department had an average test
score of 65, with a standard deviation of 10; and the maintenance department had an
average test score of 75, with a standard deviation of 15.
Test the hypothesis that the housekeeping department and the maintenance
department are equally effective. Use a 0.10 level of significance. (Assume
that waiters' performance is approximately normal.)
Solution:
The solution to this problem takes four steps: (1) state the hypotheses, (2)
formulate an analysis plan, (3) analyze sample data, and (4) interpret results.
We work through those steps below:

Null hypothesis: μ1 - μ2 = 0
Alternative hypothesis: μ1 - μ2 ≠ 0
For this analysis, the significance level is 0.10. Using sample data, we
will conduct a two-sample t-test of the null hypothesis.
Using sample data, we compute the standard error (SE), degrees of
freedom (DF), and the t-score test statistic (t).
SE = sqrt[ (s1^2/n1) + (s2^2/n2) ]
SE = sqrt[ (10^2/45) + (15^2/55) ] = sqrt(2.222 + 4.091) = 2.51
DF = (s1^2/n1 + s2^2/n2)^2 / { [ (s1^2/n1)^2 / (n1 - 1) ] + [ (s2^2/n2)^2 / (n2 - 1) ] }
DF = (10^2/45 + 15^2/55)^2 / { [ (10^2/45)^2 / 44 ] + [ (15^2/55)^2 / 54 ] }
= 39.86 / (0.1122 + 0.3099) = 94.4
t = [ (x̄1 - x̄2) - d ] / SE = [ (65 - 75) - 0 ] / 2.51 = -3.98
where s1 is the standard deviation of sample 1, s2 is the standard
deviation of sample 2, n1 is the size of sample 1, n2 is the size of
sample 2, x̄1 is the mean of sample 1, x̄2 is the mean of sample 2, d is
the hypothesized difference between the population means, and SE is
the standard error.
Since we have a two-tailed test, the P-value is the probability that a t-
score having 94 degrees of freedom is more extreme than -3.98; that
is, less than -3.98 or greater than 3.98.
For 94 degrees of freedom, a t-score this extreme has a two-tailed P-value
below 0.001.
Since the P-value is less than the significance level (0.10), we
reject the null hypothesis.

Problem 7
A credit-insurance organization has developed a new high-tech method of
training new sales personnel. The company sampled 16 employees
who were trained the original way and found average daily sales to be
$688 and the sample standard deviation was $32.63. They also
sampled 11 employees who were trained using the new method and
found average daily sales to be $ 706 and the sample standard
deviation was $24.84. At alpha = 0.05, can the company conclude that
average daily sales have increased under the new plan?
Solution
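n1 = 16     n2 = 11     n = n1 + n2 = 27
x̄1 = 688    x̄2 = 706
s1 = 32.63  s2 = 24.84
α = 0.05
H0: μ1 - μ2 = 0
H1: μ1 - μ2 < 0
Critical value: t = -1.708 (one-tailed, df = n1 + n2 - 2 = 25)
sp^2 = [ (n1 - 1)s1^2 + (n2 - 1)s2^2 ] / (n1 + n2 - 2)
= [ 15(32.63)^2 + 10(24.84)^2 ] / 25
= 885.64

t = (x̄1 - x̄2) / sqrt( sp^2 (1/n1 + 1/n2) )
= (688 - 706) / sqrt( 885.64 (1/16 + 1/11) )
= -1.544
Answer
Do not reject H0, since -1.544 > -1.708. Average daily sales have not increased significantly.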
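
The pooled-variance calculation behind the value 885.64 can be sketched in Python as follows. The helper name pooled_t is our own, and the formula assumes the two populations share a common variance, which is the usual assumption for this kind of small-sample test.

```python
from math import sqrt

def pooled_t(x1, s1, n1, x2, s2, n2):
    """Pooled-variance two-sample t statistic; returns (sp_squared, t, df)."""
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df   # pooled variance
    t = (x1 - x2) / sqrt(sp2 * (1 / n1 + 1 / n2))
    return sp2, t, df

# Original training: mean 688, SD 32.63, n 16; new method: mean 706, SD 24.84, n 11
sp2, t, df = pooled_t(688, 32.63, 16, 706, 24.84, 11)
print(round(sp2, 2), round(t, 3), df)
```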

Problem 8

Block Enterprises, a manufacturer of chips for computers, is in the
process of deciding whether to replace its current semi automated
assembly line with a fully automated assembly line. Block has gathered
some preliminary test data about hourly chip production, which is
summarized in the following table and it would like to know whether it
should upgrade its assembly line. At α = 0.02, state and test the
hypothesis to help Block decide.
                       x̄     s     n
Semi automatic Line    198   32    150
Automatic Line         206   29    200

Solution

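x̄1 = 198    s1 = 32    n1 = 150
x̄2 = 206    s2 = 29    n2 = 200

α = 0.02

H0: μ1 - μ2 = 0
H1: μ1 - μ2 < 0

Critical value: z = -2.05 (one-tailed)
z = [ (x̄1 - x̄2) - (μ1 - μ2) ] / sqrt( s1^2/n1 + s2^2/n2 )
= [ (198 - 206) - 0 ] / sqrt( 32^2/150 + 29^2/200 )
= -2.41
Answer
H0 is rejected, since -2.41 < -2.05. Block should upgrade to an automatic line.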

One Way Anova 1

Problem:
A quality control supervisor for an automobile manufacturer is concerned with
uniformity in the number of defects in cars coming off the assembly line. If one
assembly line has significantly more variability in the number of defects, then
changes have to be made. The supervisor has collected the following data:
Number of Defects
               Assembly Line A   Assembly Line B
Mean           10                11
Variance       9                 25
Sample Size    20                16
Does Assembly line B have significantly more variability in the number of
defects? Test at the 0.05 significance level.

Solution:
H0: σB^2 = σA^2
H1: σB^2 > σA^2
Observed F = sB^2/sA^2
= 25/9 = 2.778
Fcrit = F0.05(15,19)
= 2.23

Since 2.778 > 2.23, we reject H0; assembly line B does have significantly more variability in
the number of defects, so some changes have to be made.

Chi Sq Test 3

Chi-Square Goodness of Fit Test
Problem
English test grade distributions have changed from last year, with grade B's
somewhat lower. Is this significant?

English test results    Grade A   Grade B   Grade C   Grade D   Grade E
This year, O            23        32        20        15        10
Last year               25        20        15        25        10

Solution:
H0: the grade distribution has not changed from last year.

The table below shows the calculation. First, the expected values are
created by scaling last year's results to be equivalent to this year. Then the
test statistic is calculated as SUM((O - E)^2/E).

English test results    Grade A   Grade B   Grade C   Grade D   Grade E   Sum
This year, O            23        32        20        15        10        100
Last year               25        20        15        25        10        95
Scaled last year, E     26.3      21.1      15.8      26.3      10.5      100
(O - E)                 -3.3      10.9      4.2       -11.3     -0.5
(O - E)^2               11.0      119.8     17.7      128.0     0.3
(O - E)^2/E             0.4       5.7       1.1       4.9       0.0       12.1

Chi-square is found to be 12.1 and the degrees of freedom are (5-1) = 4
(there are five possible grades). Looking this up in the Chi Square table
shows the probability is between 5% (9.49) and 1% (13.28), so H0 is
adequately falsified and a significant change can be claimed.
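
The scaling and summation steps described above can be sketched in Python. The variable names are our own; the counts are the ones in the table.

```python
observed = [23, 32, 20, 15, 10]      # this year, grades A to E
last_year = [25, 20, 15, 25, 10]

scale = sum(observed) / sum(last_year)               # 100 / 95
expected = [count * scale for count in last_year]    # last year scaled to this year's total
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1
print(round(chi_square, 1), df)
```

The statistic is then compared with the chi-square table values for 4 degrees of freedom quoted in the text (9.49 at 5%, 13.28 at 1%).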

Chi-Square Test of Independence
Problem
A year group in school chooses between drama and history as below. Is
there any difference between boys' and girls' choices?

         Chose drama   Chose history
Boys     43            55
Girls    52            54

Solution:
Observed

         Chose drama   Chose history   Total
Boys     43            55              98
Girls    52            54              106
Total    95            109             204

Expected = (row total * column total)/overall total

         Chose drama   Chose history   Total
Boys     45.6          52.4            98
Girls    49.4          56.6            106
Total    95            109             204

(observed - expected)^2/expected

         Chose drama   Chose history   Total
Boys     0.2           0.1
Girls    0.1           0.1
Total                                  0.55

Chi-square is 0.55. There are (2-1)*(2-1) = 1 degree of freedom. Checking
the Chi Square table shows 0.55 is between 0.004 (95%) and 3.84 (5%), well
below the 5% critical value, so the hypothesis of independence cannot be
rejected: there is no significant difference between boys' and girls' choices.
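
The expected counts and the chi-square statistic for a contingency table can be computed with a short loop. The following Python sketch uses our own variable names and the observed counts from the table above.

```python
table = [[43, 55],    # boys:  drama, history
         [52, 54]]    # girls: drama, history

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand_total = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(table):
    for j, observed in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_square += (observed - expected) ** 2 / expected

df = (len(table) - 1) * (len(table[0]) - 1)
print(round(chi_square, 2), df)
```

The same loop works unchanged for larger tables, since the degrees of freedom are derived from the table's shape.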

Chi-Square Test Equality of Proportions
Problem
A wholesale merchant received a shipment of goods which is claimed to be
containing 5% defective items. The merchant decided to verify this. He drew
a sample of 15 items and found 3 defective items. Test the claim. Use α =
0.05.

Solution
Given p = the proportion of defectives in the whole shipment
p = p0 = 5/100 = 0.05
n = 15, p̂ = x/n = 3/15
a. Statement of null and alternate hypothesis
H0: p = 0.05
H1: p > 0.05
b. Level of significance:
Given α = 0.05. Since H1 is p > 0.05, we use a right-tailed test.
c. Test statistic and observed value:
x0 = 3, number of defectives
p̂ = 3/15 = 0.2
d. Expected value of test statistic:
P(X ≥ 3) = SUM over x = 3 to 15 of C(15, x) p0^x (1 - p0)^(15 - x), with p0 = 0.05
= 0.0362
e. Decision & Conclusion:
x0 = 3 lies in the rejection region, since 0.0362 is less than 0.05. Therefore
we reject H0 and conclude that the shipment contains more than 5% defective
items, and hence the merchant is advised to reject the shipment.
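
The tail probability in step d is an exact binomial sum, which is easy to check in Python. The helper name binomial_upper_tail is our own.

```python
from math import comb

def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k, n + 1))

# 3 or more defectives in a sample of 15 when the true rate is 5%
p_value = binomial_upper_tail(3, 15, 0.05)
reject = p_value < 0.05
print(round(p_value, 4), reject)
```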

Binomial Dist 1
1. Binomial Distribution

Problem
Harley Davidson, director of quality control for the Kyoto Motor
Company is conducting his monthly spot check of automatic
transmissions. In this procedure, 10 transmissions are removed from
the pool of components and are checked for manufacturing defects.
Historically, only 2 percent of the transmissions have flaws.
a. What is the probability that Harley's sample contains more than two
transmissions with manufacturing flaws?
b. What is the probability that none of the selected transmissions has
any manufacturing flaws?

Solution
n = 10
p = 0.02
1 - p = 0.98
Formula: P(X = x) = [ n! / (x!(n - x)!) ] p^x (1 - p)^(n - x)

P(X = 0) = [ 10! / (0!(10 - 0)!) ] (0.02)^0 (0.98)^(10 - 0)
= 0.8171

P(X = 1) = [ 10! / (1!(10 - 1)!) ] (0.02)^1 (0.98)^(10 - 1)
= 0.1667

P(X = 2) = [ 10! / (2!(10 - 2)!) ] (0.02)^2 (0.98)^(10 - 2)
= 0.0153

P(X > 2) = 1 - P(X ≤ 2)
= 1 - [P(X = 0) + P(X = 1) + P(X = 2)]
= 1 - [0.8171 + 0.1667 + 0.0153]
= 1 - 0.9991
= 0.0009

Answer
a. The probability that Harley's sample contains more than two
transmissions with manufacturing flaws is 0.0009.
b. The probability that none of the selected transmissions has any
manufacturing flaws is 0.8171.

Poisson Dist 1
Poisson Distribution

Problem
Southwestern Electronics has developed a new calculator that performs a
series of functions not yet performed by any other calculator. The
marketing department is planning to demonstrate this calculator to a
group of potential customers, but is worried about some initial
problems, which have resulted in 4 percent of the new calculators
developing mathematical inconsistencies. The marketing VP is
planning on randomly selecting a group of calculators for this
demonstration and is worried about the chances of selecting a
calculator that could start malfunctioning. He believes that whether or
not a calculator functions is a Bernoulli process and he is convinced
that the probability of malfunction is really about 0.04.
Assuming that the VP selects exactly 50 calculators to use in the
demonstration, and using the Poisson distribution as an
approximation of the binomial, what is the chance of getting
a. at least three calculators that malfunction?
b. no calculators malfunctioning?
Solution
n = 50
p = 0.04
λ = np = 2
e^(-λ) = e^(-2) = 0.13534
Formula: P(X = x) = λ^x e^(-λ) / x!
P(X = 0) = 2^0 e^(-2)/0! = 0.13534
P(X = 1) = 2^1 e^(-2)/1! = 0.27067
P(X = 2) = 2^2 e^(-2)/2! = 0.27067
P(X = 3) = 2^3 e^(-2)/3! = 0.18045

P(X ≥ 3) = 1 - P(X ≤ 2)
= 1 - [P(X = 0) + P(X = 1) + P(X = 2)]
= 1 - [0.13534 + 0.27067 + 0.27067]
= 0.32332
Answer
The chance of getting at least three calculators that malfunction is
32.33%
The chance of no calculators malfunctioning is 13.53%
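
To see how close the Poisson approximation is to the exact binomial answer, both can be computed side by side. This Python sketch uses our own helper names; the binomial calculation is included only as a cross-check and is not part of the original solution.

```python
from math import comb, exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return lam**k * exp(-lam) / factorial(k)

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p), used here only as a cross-check."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 50, 0.04
lam = n * p                                    # Poisson mean, lambda = np
poisson_at_least_3 = 1 - sum(poisson_pmf(k, lam) for k in range(3))
binomial_at_least_3 = 1 - sum(binomial_pmf(k, n, p) for k in range(3))
print(round(poisson_pmf(0, lam), 5), round(poisson_at_least_3, 5))
print(round(binomial_at_least_3, 5))
```

With n = 50 and p = 0.04 the two tail probabilities agree closely, which is why the approximation is reasonable here.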

Normal Dis 1
Normal Distribution
Problem
Regulations concerning the maximum number of people who can occupy a lift
are to be set. The total weight of 8 people chosen at random follows a normal
distribution with a mean of 550kg and a standard deviation of 150kg. What is
the probability that the total weight of 8 people exceeds 600kg?
Solution
The mean is 550kg and we are interested in the area that is greater than
600kg.
z = ( x - m ) / s
Here x = 600kg
m , the mean = 550kg
s, the standard deviation = 150kg
z = ( 600 - 550 ) / 150
z = 50 / 150
z = 0.33

Looking in the table for z = 0.3, and across under 0.03.
The number in the table is the tail area for z=0.33 which is 0.3707.
This is the probability that the weight will exceed 600kg.
Therefore, the probability that the total weight of 8 people exceeds 600kg is
0.37 correct to 2 figures.
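
The table lookup can be reproduced with Python's standard library. This sketch uses statistics.NormalDist and works with the unrounded z value, so the tail area differs slightly from the table entry for z = 0.33 but agrees with it to two figures.

```python
from statistics import NormalDist

total_weight = NormalDist(mu=550, sigma=150)   # total weight of 8 people, in kg
p_exceeds = 1 - total_weight.cdf(600)          # upper-tail area beyond 600 kg
z = (600 - 550) / 150
print(round(z, 2), round(p_exceeds, 2))
```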
