We believe that our process does not produce more than ---- defects per
batch. We plan our raw material requirements and delivery dates based on
this assumption.
We believe that the attrition rate in our company is not more than say 15%.
We plan our recruitment schedule accordingly.
We believe that our collections this month will be around 40% of our
outstandings. We plan our cash flow based on this.
Suppose we take a sample and find the sample statistic. If this is far below
or far above our assumption, then we might conclude that our hypothesis
was incorrect. But what if the difference between the observed value of say
sample mean and the assumed population parameter is not much? We
cannot be absolutely certain.
In such cases, we can neither reject nor accept the hypothesis outright.
Instead, the decision making process has to be objective, based on the
information provided by the sample.
We cannot jump to conclusions based on one sample alone (students’ guide
example). We also have to ensure that our decision making is aimed at
situations which are under comparable conditions (lawn mower example).
How low is low? To define this, one should understand that our minimum
standard for an acceptable probability is also the risk that we take of
rejecting a good hypothesis/consignment. Depending on the situation, this
acceptable level of probability (α) can assume different values. If we
want to reduce the probability of rejecting a true hypothesis, we will go
for a low value of α. In the above example, if the acceptable level of
probability is, say, 2%, then we would have accepted the above consignment.
The null hypothesis is the assumption that we want to test and is denoted
by H0. If we assume that the null hypothesis is correct, then the
significance level indicates the percentage of sample means that fall
outside certain limits. At the 5% significance level, there is a 95% area
under the normal curve where there is no significant difference between
x bar and µ, and in the remaining 5% area there is a significant difference
between x bar and µ. Hence if the sample mean falls in the 95% area, we
accept the null hypothesis, while H0 is rejected if the sample mean falls
in the 5% area.
We should understand that when we accept the null hypothesis, we do not
say that the population mean will assume the value of the hypothesized
mean; we only say that there is no statistical evidence to prove that H0 is
wrong. So when sample data does not give us reasons to reject a null
hypothesis, we accept it.
Similarly, a high value for α, say 0.1, will be chosen if one does not want
to commit a Type 2 error, i.e., when one does not want to accept the null
hypothesis when it is false (chemicals in drugs example).
Hence the situation decides the value of α.
The ‘t’ distribution is used when n < 30, σ is not known and the parent
distribution is known to be approximately normal. Also, when n > 5% of N,
the finite population correction factor is to be used while calculating the
standard error of the sampling distribution.
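
The correction mentioned above can be sketched as follows. This is a minimal illustration, assuming the usual form of the finite population correction factor, √((N – n)/(N – 1)); the function name and the numbers in the usage lines are hypothetical and do not come from the notes.

```python
from math import sqrt

def standard_error(sigma, n, population_size=None):
    """Standard error of the mean, with the finite population correction
    applied only when the sample exceeds 5% of the population."""
    se = sigma / sqrt(n)
    if population_size is not None and n > 0.05 * population_size:
        # Finite population correction factor: sqrt((N - n) / (N - 1))
        se *= sqrt((population_size - n) / (population_size - 1))
    return se

# Hypothetical figures: sigma = 10, n = 25
print(standard_error(10, 25))          # population treated as infinite
print(standard_error(10, 25, 200))     # n is 12.5% of N, correction applied
print(standard_error(10, 25, 10000))   # n is 0.25% of N, no correction
```

The correction always shrinks the standard error, since sampling a large share of a finite population leaves less room for sampling variation.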
There are times when we do not want the sample statistic to take a value
which is lower than the assumed value. In such cases we will formulate the
alternate hypothesis as
H1 : µ < µH0 (e.g., weight of items packed in a container sold by weight).
Similarly, in case we do not want the sample statistic to take a value
higher than the assumed value, the alternate hypothesis will take the form
H1 : µ > µH0 (e.g., number of defects in a batch).
In case we want the sample statistic to be neither too high nor too low as
compared to the assumed value, the alternate hypothesis will take the form
H1 : µ ≠ µH0 (e.g., diameter of a piston in an engine assembly).
In the first two cases, the significance levels correspond to only one side of
the normal curve about the mean, while in the third case the significance
levels correspond to both the sides of the normal curve about the mean.
Hence in the first case we will reject the null hypothesis only if the sample
statistic is in the left tail of the normal curve ( shaded area ) and in the
second case, we will reject H0 only if the sample statistic is in the right tail
of the normal curve ( shaded area ). These kinds of tests are called one
tailed tests. The 3rd case where H0 will be rejected if the sample statistic
falls in either of the tails ( shaded area ), is called a two tailed test.
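
The three rejection rules described above can be sketched in code. This is only an illustration, assuming a large sample so that the standardised sample statistic z can be compared with the normal curve; the function name and the z values in the usage lines are hypothetical.

```python
from statistics import NormalDist

def rejects_h0(z, alpha, tail):
    """Return True if the standardised statistic z lies in the rejection region.

    tail: "left"  for H1: mu < mu_H0,
          "right" for H1: mu > mu_H0,
          "two"   for H1: mu != mu_H0.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "left":
        return z < std_normal.inv_cdf(alpha)
    if tail == "right":
        return z > std_normal.inv_cdf(1 - alpha)
    if tail == "two":
        # The significance level is split equally between the two tails
        return abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    raise ValueError("tail must be 'left', 'right' or 'two'")

# Hypothetical z value of -1.8 tested at alpha = 0.05
print(rejects_h0(-1.8, 0.05, "left"))  # left-tail critical value is about -1.645
print(rejects_h0(-1.8, 0.05, "two"))   # two-tail critical values are about +/-1.96
```

The same z value can lead to different decisions depending on the kind of test, which is one reason the kind of test has to be fixed before sampling.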
We have to decide on the kind of test that we have to perform to get the
desired result based on the situation as referred above.
This has to be decided prior to the sampling process and not after looking
at the sample values as this would lead to erroneous conclusions.
Two tailed tests: Given that the assumed mean is 80000, the standard
deviation of the sample is 4000, the mean of a sample of size 100 is 79600
and α = 0.05, check whether the assumption is correct.
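
The arithmetic for this example can be sketched as below. It is a minimal illustration that assumes the sample standard deviation of 4000 is used as the estimate of σ and that, with n = 100, the normal curve applies; the variable names are ours.

```python
from math import sqrt
from statistics import NormalDist

mu_h0 = 80000   # hypothesized population mean
s = 4000        # standard deviation given in the example
n = 100         # sample size
x_bar = 79600   # observed sample mean
alpha = 0.05    # significance level, split over two tails

std_error = s / sqrt(n)                       # 4000 / 10 = 400
z = (x_bar - mu_h0) / std_error               # (79600 - 80000) / 400 = -1.0
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96

# Limits of the acceptance region, in the units of the sample mean
lower = mu_h0 - z_crit * std_error
upper = mu_h0 + z_crit * std_error

reject = abs(z) > z_crit
print(f"z = {z:.2f}, limits = ({lower:.0f}, {upper:.0f}), reject H0: {reject}")
```

Under these assumptions the sample mean lies one standard error below the hypothesized mean, well inside the acceptance region, so H0 is not rejected.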
When we accept a null hypothesis, what we mean is that the sample mean
observed is not significantly far away from the hypothesized value of the
population mean and hence the observed sample could have come from the
population under study. Similarly, rejecting a null hypothesis means that
the observed value of a sample mean is significantly far away from the
hypothesized mean of the population and hence the sample chosen is unlikely
to have come from a population with the assumed value of the population
mean.
If we plot the values for 1-β for each value of µ for which H1 is true, we
get a curve called the Power Curve.
We will note that as the actual population mean gets closer to the original
hypothesized mean, the power of the test reduces. In fact, when the actual
population mean is exactly equal to the hypothesized mean, we will find
that 1 – β equals α, which is what we expect: it is the probability of
rejecting a true hypothesis.
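
A few points on a power curve can be computed as in the sketch below. It assumes a two tailed test based on the normal curve and, purely for illustration, reuses a hypothesized mean of 80000 with a standard error of 400; the function name and the list of actual means are hypothetical.

```python
from statistics import NormalDist

def power_two_tailed(mu_true, mu_h0, std_error, alpha):
    """Probability (1 - beta) of rejecting H0 when the population mean is mu_true."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Distance of the actual mean from the hypothesized mean, in standard errors
    shift = (mu_true - mu_h0) / std_error
    # H0 is rejected when the standardised sample mean falls beyond +/- z_crit
    left_tail = std_normal.cdf(-z_crit - shift)
    right_tail = 1 - std_normal.cdf(z_crit - shift)
    return left_tail + right_tail

mu_h0, std_error, alpha = 80000, 400, 0.05
for mu_true in (80000, 80400, 80800, 81200, 81600):
    print(mu_true, round(power_two_tailed(mu_true, mu_h0, std_error, alpha), 3))
```

At an actual mean equal to the hypothesized mean the function returns α, and the value rises towards 1 as the actual mean moves further away.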
So, if the sample is very bad or unsatisfactory, our test is good; but as the
sample is getting better and better, the power of the test reduces and the test
is becoming ‘not so good’ after all! This uncertainty is the cost that we have
to pay for sampling and its associated errors. It is because of these errors
that hypothesis tests do not perform perfectly.
Small samples:
The calculation of the standard error is different and the ‘t’ distribution
is to be used.
We assume that the two population variances are equal.
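
One common way to compute the standard error under the equal variance assumption is to pool the two sample variances, as sketched below. The notes do not spell out the formula, so the pooled estimate used here is an assumption, and the function name and sample figures are hypothetical.

```python
from math import sqrt

def pooled_standard_error(s1, n1, s2, n2):
    """Standard error of (x1bar - x2bar) for small samples with equal variances.

    Returns the standard error and the degrees of freedom for the 't' table.
    """
    # Weighted average of the two sample variances, weights = degrees of freedom
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    std_error = sqrt(pooled_var) * sqrt(1 / n1 + 1 / n2)
    return std_error, n1 + n2 - 2

# Hypothetical samples: s1 = 2.5 with n1 = 12, s2 = 3.1 with n2 = 15
se, df = pooled_standard_error(2.5, 12, 3.1, 15)
print(f"standard error = {se:.3f}, degrees of freedom = {df}")
```

The test statistic is then the difference of the two sample means divided by this standard error, compared with the ‘t’ table at n1 + n2 – 2 degrees of freedom.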
Dependent samples:
In the previous cases, the choice of the 1st sample had no effect on the
choice of the 2nd sample.
When measuring the effect of a course, we cannot check the performance of
one set of people before the course and the performance of another set of
people after the course. In such cases we perform ‘paired difference
tests’, which are different from the test of the difference between two
independent samples.
Weight loss programme: 10 people were weighed both before and after going
through the programme and their respective weights are as below.
Before  189  202  220  207  194  177  193  202  208  233
After   170  179  203  192  172  161  174  187  186  204
From the ‘t’ table, for 9 degrees of freedom and α = 0.1, the critical
value is 1.833, so the acceptance region is to the left of 1.833. (This
value corresponds to a table indexed by the combined area in both tails,
i.e. 0.05 in the upper tail.)
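
The paired difference calculation for the weights above can be sketched as follows. The notes do not state the hypothesized mean weight loss, so it is left as a parameter, and the value 17 in the usage line is a hypothetical figure chosen only to show the mechanics.

```python
from math import sqrt
from statistics import mean, stdev

before = [189, 202, 220, 207, 194, 177, 193, 202, 208, 233]
after = [170, 179, 203, 192, 172, 161, 174, 187, 186, 204]

# Each person is their own control, so we work with the individual losses
differences = [b - a for b, a in zip(before, after)]
n = len(differences)
d_bar = mean(differences)        # mean weight loss
s_d = stdev(differences)         # sample standard deviation (n - 1 divisor)
std_error = s_d / sqrt(n)

def paired_t(mu_d_h0):
    """t statistic for H0: mean loss = mu_d_h0, with n - 1 degrees of freedom."""
    return (d_bar - mu_d_h0) / std_error

t_crit = 1.833                   # critical value quoted in the notes, 9 d.f.
hypothetical_mu_d = 17           # assumption, not given in the notes
t = paired_t(hypothetical_mu_d)
print(f"mean loss = {d_bar:.1f}, standard error = {std_error:.3f}")
print(f"t = {t:.3f}, reject H0: {t > t_crit}")
```

Because the ten differences are treated as a single sample, the test has 9 degrees of freedom rather than the 18 that two independent samples of 10 would give.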
Tests of proportions:
Two production processes under review;
Proportion of defectives p1 = .02; n1= 100; p2 = .025; n2 = 100
Are the processes equally efficient? α = .05
H0 : p1 – p2 = 0
H1 : p1 – p2 ≠ 0
σ(p1bar – p2bar) = √( p1q1/n1 + p2q2/n2 ), where q1 = 1 – p1 and q2 = 1 – p2
Proceed as before.
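
The remaining steps for this example can be sketched as below, using the standard error formula exactly as given in the notes and assuming the normal curve applies; the variable names are ours.

```python
from math import sqrt
from statistics import NormalDist

p1, n1 = 0.02, 100    # proportion of defectives and sample size, process 1
p2, n2 = 0.025, 100   # proportion of defectives and sample size, process 2
alpha = 0.05          # two tailed, since H1 is p1 - p2 != 0

q1, q2 = 1 - p1, 1 - p2
# Standard error of the difference between the two sample proportions
std_error = sqrt(p1 * q1 / n1 + p2 * q2 / n2)

# Under H0 the hypothesized difference is 0
z = (p1 - p2) / std_error
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

reject = abs(z) > z_crit
print(f"standard error = {std_error:.5f}, z = {z:.3f}, reject H0: {reject}")
```

With these figures z is roughly -0.24, far inside the acceptance region of about ±1.96, so the sample gives no evidence that the two processes differ in efficiency.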
Sometimes the exact ‘p’ value cannot be found. If σ is not known in the
above example, we would have used a ‘t’ distribution with n-1 degrees of
freedom and the ‘t’ table. At times this does not give an exact value.
For example, µH0 = 50, xbar = 49.2, s = 1.4 and n = 16;
Standard error = 1.4 / √16 = .35.
So, t = (49.2 – 50) / .35 = -2.286.
From the ‘t’ table, for 15 degrees of freedom this value lies between the
α values 0.02 and 0.05.
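
Where software is available, the ‘p’ value can be computed instead of being bracketed from the table. The sketch below integrates the ‘t’ density numerically with Simpson's rule so that it needs only the standard library; it assumes the α values in the table refer to the combined area in both tails, and the function names are ours.

```python
from math import gamma, pi, sqrt

def t_pdf(x, df):
    """Density of the 't' distribution with df degrees of freedom."""
    coeff = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t_value, df, steps=2000):
    """Two tailed 'p' value via Simpson's rule on [0, |t|]; steps must be even."""
    upper = abs(t_value)
    h = upper / steps
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_t = total * h / 3
    # The density is symmetric, so both tails together hold 1 - 2 * area
    return 1 - 2 * area_zero_to_t

mu_h0, x_bar, s, n = 50, 49.2, 1.4, 16
std_error = s / sqrt(n)                 # 1.4 / 4 = 0.35
t_value = (x_bar - mu_h0) / std_error   # about -2.286
p_value = two_tailed_p(t_value, n - 1)
print(f"t = {t_value:.3f}, p value = {p_value:.4f}")
```

The computed value falls between 0.02 and 0.05, in agreement with the bracket read from the table.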