

9.6 The Power of a Test


Section 9.1 defined Type I and Type II errors and their associated risks. Recall that $\alpha$ represents the probability that you reject the null hypothesis when it is true and should not be rejected, and $\beta$ represents the probability that you do not reject the null hypothesis when it is false and should be rejected. The power of the test, $1 - \beta$, is the probability that you correctly reject a false null hypothesis. This probability depends on how different the actual population parameter is from the value being hypothesized (under $H_0$), the value of $\alpha$ used, and the sample size. If there is a large difference between the population parameter and the hypothesized value, the power of the test will be much greater than if the difference is small. Selecting a larger value of $\alpha$ makes it easier to reject $H_0$ and therefore increases the power of a test. Increasing the sample size increases the precision of the estimates, which increases the ability to detect differences in the parameters and therefore increases the power of a test.

The power of a statistical test can be illustrated by using the Oxford Cereal Company scenario. The filling process is subject to periodic inspection by a representative of the consumer affairs office. The representative's job is to detect the possible short weighting of boxes, which means that cereal boxes having less than the specified 368 grams are sold. Thus, the representative is interested in determining whether there is evidence that the cereal boxes have a mean weight that is less than 368 grams. The null and alternative hypotheses are as follows:

$H_0$: $\mu \geq 368$ (filling process is working properly)
$H_1$: $\mu < 368$ (filling process is not working properly)

The representative is willing to accept the company's claim that the standard deviation, $\sigma$, equals 15 grams. Therefore, you can use the Z test. Using Equation (9.1) on page 302, with $\bar{X}_L$ (the lower critical $\bar{X}$ value) substituted for $\bar{X}$, you can find the value of $\bar{X}$ that enables you to reject the null hypothesis:

$$Z_{\alpha} = \frac{\bar{X}_L - \mu}{\sigma/\sqrt{n}}$$

$$Z_{\alpha}\frac{\sigma}{\sqrt{n}} = \bar{X}_L - \mu$$

$$\bar{X}_L = \mu + Z_{\alpha}\frac{\sigma}{\sqrt{n}}$$

Because you have a one-tail test with a level of significance of 0.05, the lower-tail critical value $Z_{\alpha}$ is equal to -1.645 (see Figure 9.15). The sample size is $n = 25$. Therefore,

$$\bar{X}_L = 368 + (-1.645)\frac{15}{\sqrt{25}} = 368 - 4.935 = 363.065$$

The decision rule for this one-tail test is

Reject $H_0$ if $\bar{X} < 363.065$; otherwise, do not reject $H_0$.
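
The critical-value arithmetic above can be checked with a short sketch. It uses only Python's standard library; the function name `lower_critical_mean` and its parameter names are illustrative choices, not part of the text. It assumes a lower-tail Z test with known standard deviation, as in the cereal example.

```python
from math import sqrt
from statistics import NormalDist


def lower_critical_mean(mu0, sigma, n, alpha):
    """Lower critical sample mean for a lower-tail Z test (hypothetical helper).

    mu0   : hypothesized population mean
    sigma : known population standard deviation
    n     : sample size
    alpha : level of significance
    """
    # Standard normal value with lower-tail area alpha (negative for alpha < 0.5).
    z_alpha = NormalDist().inv_cdf(alpha)
    # X_L = mu0 + Z_alpha * sigma / sqrt(n)
    return mu0 + z_alpha * sigma / sqrt(n)


# Cereal example: mu0 = 368 grams, sigma = 15 grams, n = 25, alpha = 0.05.
x_l = lower_critical_mean(mu0=368, sigma=15, n=25, alpha=0.05)
print(f"Reject H0 if the sample mean is below {x_l:.3f} grams")
```

Because the sketch does not round the critical Z value to -1.645, its result can differ from the hand calculation in the later decimal places.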

FIGURE 9.15 Determining the lower critical value for a one-tail Z test for a population mean at the 0.05 level of significance

[Figure: normal curve centered at $\mu = 368$ (Z = 0). The region of rejection, with area .05, lies below $\bar{X}_L$ ($Z_L = -1.645$); the region of nonrejection has area .95.]


CHAPTER 9 Fundamentals of Hypothesis Testing: One-Sample Tests

The decision rule states that if, in a random sample of 25 boxes, the sample mean is less than 363.065 grams, you reject the null hypothesis, and the representative concludes that the process is not working properly. The power of the test measures the probability of concluding that the process is not working properly for differing values of the true population mean.

What is the power of the test if the actual population mean is 360 grams? To determine the chance of rejecting the null hypothesis when the population mean is 360 grams, you need to determine the area under the normal curve below $\bar{X}_L = 363.065$ grams. Using Equation (9.1), with the population mean $\mu = 360$,

$$Z_{STAT} = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{363.065 - 360}{15/\sqrt{25}} = 1.02$$

From Table E.2, there is an 84.61% chance that the Z value is less than +1.02. This is the power of the test when the actual population mean is $\mu = 360$ (see Figure 9.16). The probability ($\beta$) that you will not reject the null hypothesis ($\mu = 368$) is 1 - 0.8461 = 0.1539. Thus, the probability of committing a Type II error is 15.39%.
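
The same power calculation can be sketched in Python using only the standard library. The function name `power_lower_tail` and its parameters are illustrative assumptions, not names from the text. The sketch assumes the lower-tail Z test of the cereal example with a known standard deviation.

```python
from math import sqrt
from statistics import NormalDist


def power_lower_tail(mu_true, mu0, sigma, n, alpha):
    """Power of a lower-tail Z test when the true mean is mu_true (hypothetical helper)."""
    se = sigma / sqrt(n)  # standard error of the sample mean
    # Critical sample mean under H0: reject when the sample mean falls below it.
    x_l = mu0 + NormalDist().inv_cdf(alpha) * se
    # Power is the probability that the sample mean falls below x_l
    # when the sampling distribution is centered at the true mean.
    return NormalDist(mu_true, se).cdf(x_l)


power_360 = power_lower_tail(mu_true=360, mu0=368, sigma=15, n=25, alpha=0.05)
beta_360 = 1 - power_360  # probability of a Type II error
print(f"power = {power_360:.4f}, beta = {beta_360:.4f}")
```

The printed values should be close to, but not identical to, the table-based 0.8461 and 0.1539, because the sketch does not round Z to two decimal places before looking up the area.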
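
FIGURE 9.16 Determining the power of the test and the probability of a Type II error when $\mu = 360$ grams

[Figure: normal curve centered at $\mu = 360$. The area below $\bar{X}_L = 363.065$ (Z = +1.02) is the power, .8461; the area above it is $\beta = .1539$.]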

Now that you have determined the power of the test if the population mean were equal to 360, you can calculate the power for any other value of $\mu$. For example, what is the power of the test if the population mean is 352 grams? Assuming the same standard deviation, sample size, and level of significance, the decision rule is

Reject $H_0$ if $\bar{X} < 363.065$; otherwise, do not reject $H_0$.

Once again, because you are testing a hypothesis for a mean, from Equation (9.1),

$$Z_{STAT} = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$

If the population mean shifts down to 352 grams (see Figure 9.17), then

$$Z_{STAT} = \frac{363.065 - 352}{15/\sqrt{25}} = 3.69$$


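FIGURE 9.17 Determining the power of the test and the probability of a Type II error when $\mu = 352$ grams

[Figure: normal curve centered at $\mu = 352$ (Z = 0). The area below $\bar{X}_L = 363.065$ (Z = +3.69) is the power, .99989; the area above it is $\beta = .00011$.]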

From Table E.2, there is a 99.989% chance that the Z value is less than +3.69. This is the power of the test when the population mean is 352. The probability ($\beta$) that you will not reject the null hypothesis ($\mu = 368$) is 1 - 0.99989 = 0.00011. Thus, the probability of committing a Type II error is only 0.011%.

In the preceding two examples, the power of the test is high, and the chance of committing a Type II error is low. In the next example, you compute the power of the test when the population mean is equal to 367 grams, a value that is very close to the hypothesized mean of 368 grams. Once again, from Equation (9.1),

$$Z_{STAT} = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$

If the population mean is equal to 367 grams (see Figure 9.18), then

$$Z_{STAT} = \frac{363.065 - 367}{15/\sqrt{25}} = -1.31$$

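FIGURE 9.18 Determining the power of the test and the probability of a Type II error when $\mu = 367$ grams

[Figure: normal curve centered at $\mu = 367$ (Z = 0). The area below $\bar{X}_L = 363.065$ (Z = -1.31) is the power, .0951; the area above it is $\beta = .9049$.]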

From Table E.2, the probability less than Z = -1.31 is 0.0951 (or 9.51%). Because the rejection region is in the lower tail of the distribution, the power of the test is 9.51%, and the chance of making a Type II error is 90.49%. Figure 9.19 illustrates the power of the test for various possible values of $\mu$ (including the three values examined). This graph is called a power curve.
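
A power curve of this kind can be tabulated with a short sketch using only Python's standard library. The constant and function names below are illustrative assumptions, and the test is assumed to be the lower-tail Z test of the cereal example (hypothesized mean 368, standard deviation 15, n = 25, level of significance 0.05).

```python
from math import sqrt
from statistics import NormalDist

# Values from the cereal example; the names are hypothetical.
MU0, SIGMA, N, ALPHA = 368, 15, 25, 0.05


def power_lower_tail(mu_true, mu0=MU0, sigma=SIGMA, n=N, alpha=ALPHA):
    """Power of the lower-tail Z test when the true mean is mu_true."""
    se = sigma / sqrt(n)
    x_l = mu0 + NormalDist().inv_cdf(alpha) * se  # critical sample mean
    return NormalDist(mu_true, se).cdf(x_l)


# One point of the power curve for each whole-gram value of the true mean.
power_curve = {mu: power_lower_tail(mu) for mu in range(352, 369)}

for mu, power in power_curve.items():
    print(f"{mu:>4}  {power:.5f}")
```

The tabulated values may differ slightly from table-based figures because Z is not rounded before the area is computed. At a true mean of 368 the computed power equals the level of significance.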



FIGURE 9.19 Power curve of the cereal-box-filling process for $H_1$: $\mu < 368$ grams

[Figure: power (vertical axis, 0.00 to 1.00) plotted against possible values for $\mu$ in grams (horizontal axis, 352 to 368). Plotted values:]

$\mu$ (grams)   Power
352             .99989
353             .99961
354             .99874
355             .9964
356             .9909
357             .9783
358             .9545
359             .9131
360             .8461
361             .7549
362             .6406
363             .5080
364             .3783
365             .2578
366             .1635
367             .0951
368             .0500

Footnote 1: For situations involving one-tail tests in which the actual mean, $\mu_1$, exceeds the hypothesized mean, the converse would be true. The larger the actual mean, $\mu_1$, compared with the hypothesized mean, the greater is the power. For two-tail tests, the greater the distance between the actual mean, $\mu_1$, and the hypothesized mean, the greater the power of the test.

From Figure 9.19, you can see that the power of this one-tail test increases sharply (and approaches 100%) as the population mean takes on values farther below the hypothesized mean of 368 grams. Clearly, for this one-tail test, the smaller the actual mean $\mu$, the greater the power to detect this difference (see footnote 1). For values of $\mu$ close to 368 grams, the power is small because the test cannot effectively detect small differences between the actual population mean and the hypothesized value of 368 grams. When the population mean approaches 368 grams, the power of the test approaches $\alpha$, the level of significance (which is 0.05 in this example).

Figure 9.20 summarizes the computations for the three cases. You can see the drastic changes in the power of the test for different values of the actual population mean by reviewing the different panels of Figure 9.20. From Panels A and B you can see that when the population mean does not greatly differ from 368 grams, the chance of rejecting the null hypothesis, based on the decision rule involved, is not large. However, when the population mean shifts substantially below the hypothesized 368 grams, the power of the test greatly increases, approaching its maximum value of 1 (or 100%).

In the above discussion, a one-tail test with $\alpha = 0.05$ and $n = 25$ was used. The type of statistical test (one-tail vs. two-tail), the level of significance, and the sample size all affect the power. Three basic conclusions regarding the power of the test are summarized below:

1. A one-tail test is more powerful than a two-tail test with the same level of significance and sample size, provided that the actual mean lies in the direction specified by the alternative hypothesis.
2. An increase in the level of significance ($\alpha$) results in an increase in power. A decrease in $\alpha$ results in a decrease in power.
3. An increase in the sample size, $n$, results in an increase in power. A decrease in the sample size, $n$, results in a decrease in power.
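
The three conclusions can be illustrated numerically with a minimal sketch that uses only Python's standard library. The function `power_z_test` and its `tails` parameter are hypothetical names introduced here. The two-tail branch assumes the usual rejection region of equal area in each tail, which the text does not spell out. The true mean of 360 grams is taken from the earlier example.

```python
from math import sqrt
from statistics import NormalDist


def power_z_test(mu_true, mu0, sigma, n, alpha, tails=1):
    """Power of a Z test of mu0 when the true mean is mu_true (hypothetical helper).

    tails=1: lower-tail test (H1: mu < mu0).
    tails=2: two-tail test (H1: mu != mu0), with alpha/2 in each tail.
    """
    se = sigma / sqrt(n)
    sampling = NormalDist(mu_true, se)  # sampling distribution of the mean
    if tails == 1:
        x_l = mu0 + NormalDist().inv_cdf(alpha) * se
        return sampling.cdf(x_l)
    if tails == 2:
        half_width = NormalDist().inv_cdf(1 - alpha / 2) * se
        # Reject when the sample mean falls below the lower or above the upper limit.
        return sampling.cdf(mu0 - half_width) + (1 - sampling.cdf(mu0 + half_width))
    raise ValueError("tails must be 1 or 2")


base = power_z_test(360, 368, 15, 25, 0.05)               # one-tail, alpha = 0.05, n = 25
two_tail = power_z_test(360, 368, 15, 25, 0.05, tails=2)  # conclusion 1
smaller_alpha = power_z_test(360, 368, 15, 25, 0.01)      # conclusion 2
larger_n = power_z_test(360, 368, 15, 100, 0.05)          # conclusion 3

print(f"one-tail baseline : {base:.4f}")
print(f"two-tail test     : {two_tail:.4f}")
print(f"alpha = 0.01      : {smaller_alpha:.4f}")
print(f"n = 100           : {larger_n:.4f}")
```

With these inputs, the two-tail test and the smaller level of significance should each give lower power than the baseline, and the larger sample should give higher power.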

FIGURE 9.20 Determining statistical power for varying values of the population mean

Panel A. Given: $\alpha = .05$, $\sigma = 15$, $n = 25$; one-tail test; $\mu = 368$ (null hypothesis is true).
$\bar{X}_L = 368 - (1.645)\frac{15}{\sqrt{25}} = 363.065$
Decision rule: Reject $H_0$ if $\bar{X} < 363.065$; otherwise, do not reject $H_0$.
Region of rejection (below $\bar{X}_L = 363.065$): $\alpha = .050$. Region of nonrejection: $1 - \alpha = .95$.

Panel B. Given: $\alpha = .05$, $\sigma = 15$, $n = 25$; one-tail test; $H_0$: $\mu = 368$; $\mu = 367$ (true mean shifts to 367 grams).
$Z_{STAT} = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{363.065 - 367}{3} = -1.31$
Power = .0951; $\beta = .9049$.

Panel C. Given: $\alpha = .05$, $\sigma = 15$, $n = 25$; one-tail test; $H_0$: $\mu = 368$; $\mu = 360$ (true mean shifts to 360 grams).
$Z_{STAT} = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{363.065 - 360}{3} = +1.02$
Power = .8461; $\beta = .1539$.

Panel D. Given: $\alpha = .05$, $\sigma = 15$, $n = 25$; one-tail test; $H_0$: $\mu = 368$; $\mu = 352$ (true mean shifts to 352 grams).
$Z_{STAT} = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{363.065 - 352}{3} = +3.69$
Power = .99989; $\beta = .00011$.


Problems for Section 9.6


APPLYING THE CONCEPTS

9.80 A coin-operated soft-drink machine is designed to discharge at least 7 ounces of beverage per cup, with a standard deviation of 0.2 ounce. If you select a random sample of 16 cups and you are willing to have an $\alpha = 0.05$ risk of committing a Type I error, compute the power of the test and the probability of a Type II error ($\beta$) if the population mean amount dispensed is actually
a. 6.9 ounces per cup.
b. 6.8 ounces per cup.

9.81 Refer to Problem 9.80. If you are willing to have an $\alpha = 0.01$ risk of committing a Type I error, compute the power of the test and the probability of a Type II error ($\beta$) if the population mean amount dispensed is actually
a. 6.9 ounces per cup.
b. 6.8 ounces per cup.
c. Compare the results in (a) and (b) of this problem and in Problem 9.80. What conclusion can you reach?

9.82 Refer to Problem 9.80. If you select a random sample of 25 cups and are willing to have an $\alpha = 0.05$ risk of committing a Type I error, compute the power of the test and the probability of a Type II error ($\beta$) if the population mean amount dispensed is actually
a. 6.9 ounces per cup.
b. 6.8 ounces per cup.
c. Compare the results in (a) and (b) of this problem and in Problem 9.80. What conclusion can you reach?

9.83 A tire manufacturer produces tires that have a mean life of at least 25,000 miles when the production process is working properly. Based on past experience, the standard deviation of the tires is 3,500 miles. The operations manager stops the production process if there is evidence that the mean tire life is below 25,000 miles. If you select a random sample of 100 tires (to be subjected to destructive testing) and you are willing to have an $\alpha = 0.05$ risk of committing a Type I error, compute the power of the test and the probability of a Type II error ($\beta$) if the population mean life is actually
a. 24,000 miles.
b. 24,900 miles.

9.84 Refer to Problem 9.83. If you are willing to have an $\alpha = 0.01$ risk of committing a Type I error, compute the power of the test and the probability of a Type II error ($\beta$) if the population mean life is actually
a. 24,000 miles.
b. 24,900 miles.
c. Compare the results in (a) and (b) of this problem and (a) and (b) in Problem 9.83. What conclusion can you reach?

9.85 Refer to Problem 9.83. If you select a random sample of 25 tires and are willing to have an $\alpha = 0.05$ risk of committing a Type I error, compute the power of the test and the probability of a Type II error ($\beta$) if the population mean life is actually
a. 24,000 miles.
b. 24,900 miles.
c. Compare the results in (a) and (b) of this problem and (a) and (b) in Problem 9.83. What conclusion can you reach?

9.86 Refer to Problem 9.83. If the operations manager stops the process when there is evidence that the mean life is different from 25,000 miles (either less than or greater than) and a random sample of 100 tires is selected, along with a level of significance of $\alpha = 0.05$, compute the power of the test and the probability of a Type II error ($\beta$) if the population mean life is actually
a. 24,000 miles.
b. 24,900 miles.
c. Compare the results in (a) and (b) of this problem and (a) and (b) in Problem 9.83. What conclusion can you reach?
