
Chapter 8

Estimation
Estimation is a procedure by which a numerical value or values are assigned to a
population parameter based on the information collected from a sample.
Estimation: The assignment of value(s) to a population parameter based on the value of the
corresponding sample statistic is called estimation.

Estimate and Estimator

 The value(s) assigned to a population parameter based on the value of a sample statistic is
called an estimate. The sample statistic used to estimate a population parameter is called
an estimator.

The estimation procedure involves the following steps.


1. Select a sample.
2. Collect the required information from the members of the sample.
3. Calculate the value of the sample statistic.
4. Assign value(s) to the corresponding population parameter.

Point and Interval Estimate

Point Estimate
The value of a sample statistic that is used to estimate a population parameter is called a point
estimate.

Thus, the value computed for the sample mean 𝑥̅ from a sample is a point estimate of the
corresponding population mean 𝜇.

Interval Estimation
In interval estimation, an interval is constructed around the point estimate, and it is stated that this
interval is likely to contain the corresponding population parameter.

Confidence Level and Confidence Interval


Each interval is constructed with regard to a given confidence level and is called a confidence
interval. The confidence interval is given as
Point estimate ± Margin of error

The confidence level associated with a confidence interval states how much confidence we have
that this interval contains the true population parameter. The confidence level is denoted by
(1 − 𝛼)100%.

Although any value of the confidence level can be chosen to construct a confidence interval, the
more common values are 90%, 95%, and 99%. The corresponding confidence coefficients are .90,
.95, and .99, respectively.

Estimation of Population Mean: 𝝈 known

Here, there are three possible cases, as follows.

Case I. If the following three conditions are fulfilled:


1. The population standard deviation 𝜎 is known
2. The sample size is small (i.e., 𝑛 < 30)
3. The population from which the sample is selected is normally distributed,
then we use the normal distribution to make the confidence interval for 𝜇 .

Case II. If the following two conditions are fulfilled:


1. The population standard deviation 𝜎 is known
2. The sample size is large (i.e., 𝑛 ≥ 30)
then, again, we use the normal distribution to make the confidence interval for 𝜇 .

Case III. If the following three conditions are fulfilled:


1. The population standard deviation 𝜎 is known
2. The sample size is small (i.e., 𝑛 < 30)
3. The population from which the sample is selected is not normally distributed (or its
distribution is unknown),
then we use a nonparametric method to make the confidence interval for 𝜇.

The following chart summarizes the above three cases.

𝜎 Is Known

Case I                       Case II          Case III
1. n < 30                    n ≥ 30           1. n < 30
2. Population is normal                       2. Population is not normal

Cases I and II: Use the normal distribution to estimate 𝜇
Case III: Use a nonparametric method to estimate 𝜇

Confidence Interval for 𝝁

The (1 − 𝛼)100% confidence interval for 𝜇 under Cases I and II is

\[ \bar{x} \pm z\,\sigma_{\bar{x}}, \qquad \text{where } \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \]

The value of z used here is obtained from the standard normal distribution table.

Margin of Error: The margin of error for the estimate of 𝜇, denoted by E, is the quantity that is
subtracted from and added to the value of 𝑥̅ to obtain a confidence interval for 𝜇. Thus,
𝐸 = 𝑧𝜎𝑥̅

Table: z Values for Commonly Used Confidence Levels

Confidence Level    Areas to Look for in Table IV    z Value
90%                 .0500 and .9500                  1.64 or 1.65
95%                 .0250 and .9750                  1.96
99%                 .0050 and .9950                  2.58
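
The z values in the table can also be reproduced without a printed table. The following is a minimal sketch, assuming Python's standard-library `statistics.NormalDist`; the helper name `z_for_confidence` is introduced here for illustration only.

```python
from statistics import NormalDist


def z_for_confidence(confidence_level: float) -> float:
    """Return the z value that leaves (1 - confidence_level)/2 in each tail."""
    alpha = 1.0 - confidence_level
    # The upper cut-off has cumulative area 1 - alpha/2 to its left,
    # e.g. .9500 for a 90% confidence level.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {z_for_confidence(level):.3f}")
```

For the 90% level the computed value is about 1.645, which lies between the two table entries 1.64 and 1.65; either rounded value is commonly used.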

Problem
A publishing company has just published a new college textbook. Before the company decides the
price at which to sell this textbook, it wants to know the average price of all such textbooks in the
market. The research department at the company took a sample of 25 comparable textbooks and
collected information on their prices. This information produced a mean price of $145 for this
sample. It is known that the standard deviation of the prices of all such textbooks is $35 and the
population of such prices is normal.

(a) What is the point estimate of the mean price of all such college textbooks?
(b) Construct a 90% confidence interval for the mean price of all such college textbooks.

Solution
Here, 𝜎 is known and, although n < 30, the population is normally distributed (Case I).
Hence, we can use the normal distribution to construct the confidence interval for 𝜇. From the given
information,

n = 25, 𝑥̅ = $145, and 𝜎 = $35

The standard deviation of 𝑥̅ is

\[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{35}{\sqrt{25}} = \$7.00 \]

(a) The point estimate of the mean price of all such college textbooks is $145; that is,

Point estimate of 𝜇 = 𝑥̅ = $145

(b) The confidence level is 90%, or .90. First we find the z value for a 90% confidence level.
The value is z =1.65 (from Table).
Next, we substitute all the values in the confidence interval formula for 𝜇. The 90% confidence
interval for 𝜇 is
𝑥̅ ± 𝑧𝜎𝑥̅ = 145 ± 1.65(7.00) = 145 ± 11.55
= (145 − 11.55) to (145 + 11.55) = $133.45 to $156.55
Thus, we are 90% confident that the mean price of all such college textbooks is between $133.45
and $156.55.
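
The calculation above can be checked with a few lines of code. This is a minimal sketch, not a prescribed method: the function name `z_interval` is hypothetical, and the z value is passed in as read from the table.

```python
import math


def z_interval(x_bar: float, sigma: float, n: int, z: float) -> tuple[float, float]:
    """Confidence interval for the mean when sigma is known: x_bar ± z * sigma / sqrt(n)."""
    sigma_x_bar = sigma / math.sqrt(n)   # standard deviation of the sample mean
    margin = z * sigma_x_bar             # margin of error E
    return x_bar - margin, x_bar + margin


# Textbook-price problem: n = 25, x_bar = $145, sigma = $35, z = 1.65 (90% level)
lower, upper = z_interval(145, 35, 25, 1.65)
print(f"${lower:.2f} to ${upper:.2f}")
```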

Problem: A sample of 1500 homes sold recently in a state gave the mean price of homes equal to
$299,720. The population standard deviation of the prices of homes in this state is $68,650.
Construct a 99% confidence interval for the mean price of all homes in this state.

Solution:
From the given information,

𝑛 = 1500, 𝑥̅ = $299,720, and 𝜎 = $68,650


In this example, the shape of the population distribution is unknown, but the population standard
deviation is known and the sample size is large (n ≥ 30). Hence, we can use the normal
distribution to make a confidence interval for 𝜇.

The standard deviation of 𝑥̅ is

\[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{68{,}650}{\sqrt{1500}} \approx \$1772.535 \]

The z value for a 99% confidence level is 2.58.

The 99% confidence interval for 𝜇 is

𝑥̅ ± 𝑧𝜎𝑥̅ = 299,720 ± 2.58(1772.535) = 299,720 ± 4573.14
= (299,720 − 4573.14) to (299,720 + 4573.14) = $295,146.86 to $304,293.14
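
One way to read a confidence level is as a long-run success rate: if many samples were drawn and an interval 𝑥̅ ± 𝑧𝜎𝑥̅ were computed from each, roughly (1 − 𝛼)100% of those intervals would contain 𝜇. The sketch below illustrates this by simulation. The population values (𝜇 = 100, 𝜎 = 15), the number of trials, and the function name `coverage_rate` are all hypothetical choices made for the illustration.

```python
import math
import random


def coverage_rate(mu: float, sigma: float, n: int, z: float,
                  trials: int, seed: int = 0) -> float:
    """Fraction of simulated intervals x_bar ± z*sigma/sqrt(n) that contain mu."""
    rng = random.Random(seed)            # fixed seed so the run is repeatable
    margin = z * sigma / math.sqrt(n)    # same margin of error for every sample
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / trials


# Hypothetical normal population (not from the text): mu = 100, sigma = 15, n = 25
rate = coverage_rate(mu=100, sigma=15, n=25, z=1.96, trials=4000)
print(f"Share of 95% intervals containing mu: {rate:.3f}")
```

The printed share should be close to .95, though it will not equal it exactly because only a finite number of samples is simulated.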

Determining the Sample Size for the Estimation of 𝝁


Given the confidence level and the standard deviation of the population, the sample size that will
produce a predetermined margin of error E of the confidence interval estimate of 𝜇 is

\[ n = \frac{z^2 \sigma^2}{E^2} \]

If we do not know 𝜎, we can take a preliminary sample (of any arbitrarily determined size) and
find the sample standard deviation, s. Then we can use s for 𝜎 in the formula. However, note that
using s for 𝜎 may give a sample size that eventually may produce an error much larger (or smaller)
than the predetermined margin of error. This will depend on how close s and 𝜎 are.

Problem

An alumni association wants to estimate the mean debt of this year’s college graduates. It is known
that the population standard deviation of the debts of this year’s college graduates is $11,800. How
large a sample should be selected so that the estimate with a 99% confidence level is within $800
of the population mean?

Solution
The alumni association wants the 99% confidence interval for the mean debt of
this year’s college graduates to be

𝑥̅ ± 800
Hence, the maximum size of the margin of error of the estimate is to be $800; that is,

E = $800

The value of z for a 99% confidence level is 2.58. The value of 𝜎 is given to be $11,800.

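Therefore, substituting all values in the formula and simplifying, we obtain

\[ n = \frac{z^2 \sigma^2}{E^2} = \frac{(2.58)^2 (11{,}800)^2}{(800)^2} = 1448.18 \approx 1449 \]

Thus, the required sample size is 1449. Note that n is rounded up to the next whole number, because
rounding down to 1448 would give a margin of error slightly larger than $800.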
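
The same computation can be sketched in code. The function name `required_sample_size` is hypothetical; the only design point worth noting is that the result is rounded up rather than to the nearest integer, so that the margin of error does not exceed E.

```python
import math


def required_sample_size(z: float, sigma: float, margin: float) -> int:
    """Smallest n for which z * sigma / sqrt(n) does not exceed the margin E."""
    n_exact = (z ** 2) * (sigma ** 2) / (margin ** 2)
    return math.ceil(n_exact)   # always round up, never to the nearest integer


# Alumni-debt problem: z = 2.58 (99% level), sigma = $11,800, E = $800
print(required_sample_size(2.58, 11_800, 800))
```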

Estimation of Population Mean: 𝝈 not known

Here, there are three possible cases, as follows.

Case I. If the following three conditions are fulfilled:


1. The population standard deviation 𝜎 is not known
2. The sample size is small (i.e., 𝑛 < 30)
3. The population from which the sample is selected is normally distributed,
then we use the t distribution to make the confidence interval for 𝜇.

Case II. If the following two conditions are fulfilled:


1. The population standard deviation 𝜎 is not known
2. The sample size is large (i.e., 𝑛 ≥ 30)
then, again, we use the t distribution to make the confidence interval for 𝜇.

Case III. If the following three conditions are fulfilled:


1. The population standard deviation 𝜎 is not known
2. The sample size is small (i.e., 𝑛 < 30)
3. The population from which the sample is selected is not normally distributed (or its
distribution is unknown),
then we use a nonparametric method to make the confidence interval for 𝜇.

The following chart summarizes the above three cases.

𝜎 Is Not Known

Case I                       Case II          Case III
1. n < 30                    n ≥ 30           1. n < 30
2. Population is normal                       2. Population is not normal

Cases I and II: Use the t distribution to estimate 𝜇
Case III: Use a nonparametric method to estimate 𝜇
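
The two case charts (𝜎 known and 𝜎 not known) amount to a small decision rule. The sketch below encodes that rule under the assumptions stated in this chapter; the function name `choose_method` and its parameters are hypothetical, and `population_normal=False` stands for a population that is not normal or whose distribution is unknown.

```python
def choose_method(sigma_known: bool, n: int, population_normal: bool) -> str:
    """Pick the procedure for a confidence interval for mu, following the case charts."""
    if n < 30 and not population_normal:
        # Case III in both charts: small sample, non-normal or unknown shape
        return "nonparametric method"
    # Cases I and II: n >= 30, or n < 30 with a normal population
    return "normal distribution" if sigma_known else "t distribution"


print(choose_method(sigma_known=True, n=25, population_normal=True))
print(choose_method(sigma_known=False, n=64, population_normal=False))
print(choose_method(sigma_known=False, n=12, population_normal=False))
```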

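Confidence Interval for 𝝁

The (1 − 𝛼)100% confidence interval for 𝜇 under Cases I and II is

\[ \bar{x} \pm t\,s_{\bar{x}}, \qquad \text{where } s_{\bar{x}} = \frac{s}{\sqrt{n}} \]

The value of t used here is obtained from the t distribution table for n − 1 degrees of freedom
and the given confidence level.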

Margin of Error: The margin of error for the estimate of 𝜇, denoted by E, is the quantity that is
subtracted from and added to the value of 𝑥̅ to obtain a confidence interval for 𝜇. Thus,
𝐸 = 𝑡𝑠𝑥̅

The t Distribution
The t distribution is a specific type of bell-shaped distribution with a lower height and a wider
spread than the standard normal distribution. As the sample size becomes larger, the t distribution
approaches the standard normal distribution. The t distribution has only one parameter, called the
degrees of freedom (df). The mean of the t distribution is equal to 0, and its standard deviation is
√(df/(df − 2)) for df > 2, which is always greater than 1.

EXAMPLE
Dr. Moore wanted to estimate the mean cholesterol level for all adult men living in Hartford. He
took a sample of 25 adult men from Hartford and found that the mean cholesterol level for this
sample is 186 mg/dL with a standard deviation of 12 mg/dL. Assume that the cholesterol levels
for all adult men in Hartford are (approximately) normally distributed. Construct a 95% confidence
interval for the population mean 𝜇.

Solution
Here, 𝜎 is not known, n <30, and the population is normally distributed. Therefore, we will use
the t distribution to make a confidence interval for 𝜇. From the given information,

n =25, 𝑥̅ = 186, s =12,


and Confidence level =95%, or .95

The value of 𝑠𝑥̅ is

\[ s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{12}{\sqrt{25}} = 2.40 \]
To find the value of t, we need to know the degrees of freedom and the area under the t distribution
curve in each tail.
Degrees of freedom =n −1 =25 − 1 =24
From the t distribution table, Table V of Appendix C, the value of t for df = 24 and .025 area in the
right tail is 2.064.

When we substitute all values in the formula for the confidence interval for 𝜇, the 95% confidence
interval is

𝑥̅ ± 𝑡𝑠𝑥̅ = 186 ± 2.064(2.40) = 186 ± 4.95 = 181.05 to 190.95

Thus, we can state with 95% confidence that the mean cholesterol level for all adult men
living in Hartford is between 181.05 and 190.95 mg/dL.
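
The arithmetic of this example can be sketched as follows. The function name `t_interval` is hypothetical, and the t value is supplied by the caller as read from the t distribution table, since the sketch deliberately avoids depending on any statistics library.

```python
import math


def t_interval(x_bar: float, s: float, n: int, t: float) -> tuple[float, float]:
    """Confidence interval for the mean when sigma is unknown: x_bar ± t * s / sqrt(n).

    The t value must be looked up for df = n - 1 and the chosen confidence level.
    """
    s_x_bar = s / math.sqrt(n)   # estimated standard deviation of the sample mean
    margin = t * s_x_bar         # margin of error E
    return x_bar - margin, x_bar + margin


# Cholesterol example: n = 25, x_bar = 186, s = 12, t = 2.064 (df = 24, 95% level)
lower, upper = t_interval(186, 12, 25, 2.064)
print(f"{lower:.2f} to {upper:.2f} mg/dL")
```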

EXAMPLE
Sixty-four randomly selected adults who buy books for general reading were asked how much they
usually spend on books per year. The sample produced a mean of $1450 and a standard deviation
of $300 for such annual expenses. Determine a 99% confidence interval for the corresponding
population mean.
Solution
From the given information,

n =64, 𝑥̅ = 1450, s =300,

and Confidence level = 99%, or .99

Here 𝜎 is not known, but the sample size is large (n ≥ 30). Hence, we will use the t distribution to
make a confidence interval for 𝜇. First we calculate the standard deviation of 𝑥̅, the number of
degrees of freedom, and the area in each tail of the t distribution:

\[ s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{300}{\sqrt{64}} = 37.50 \]

df = n − 1 = 64 − 1 = 63

Area in each tail = (1 − .99)/2 = .005

From the t distribution table, the value of t for df = 63 and .005 area in the right tail is 2.656.

When we substitute all values in the formula for the confidence interval for 𝜇, the 99% confidence
interval is

𝑥̅ ± 𝑡𝑠𝑥̅ = 1450 ± 2.656(37.50) = 1450 ± 99.60 = 1350.40 to 1549.60

Thus, we can state with 99% confidence that based on this sample the mean annual expenditure
on books by all adults who buy books for general reading is between $1350.40 and $1549.60.

Estimation of a Population Proportion: Large Samples


Often we want to estimate the population proportion or percentage. (Recall that a percentage is
obtained by multiplying the proportion by 100.) For example, the production manager of a
company may want to estimate the proportion of defective items produced on a machine. A bank
manager may want to find the percentage of customers who are satisfied with the service provided
by the bank.

Recall that the population proportion is denoted by p, and the sample proportion is denoted by 𝑝̂.
The sample proportion 𝑝̂ is a sample statistic, and it possesses a sampling distribution. We know
that for large samples:

1. The sampling distribution of the sample proportion 𝑝̂ is (approximately) normal.
2. The mean 𝜇𝑝̂ of the sampling distribution of 𝑝̂ is equal to the population proportion, p.
3. The standard deviation 𝜎𝑝̂ of the sampling distribution of the sample proportion 𝑝̂ is √(𝑝𝑞/𝑛),
where 𝑞 = 1 − 𝑝.

Note:
In the case of a proportion, a sample is considered to be large if np and nq are both greater than 5.
If p and q are not known, then 𝑛𝑝̂ and 𝑛𝑞̂ should each be greater than 5 for the sample to be large.
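
The large-sample rule in the note can be written as a one-line check. This is a sketch only: the function name `is_large_sample` is hypothetical, and the second call uses made-up values to show a sample that fails the rule.

```python
def is_large_sample(n: int, p: float) -> bool:
    """Rule from the note above: both n*p and n*q must exceed 5, where q = 1 - p."""
    q = 1.0 - p
    return n * p > 5 and n * q > 5


# n = 1000 with a sample proportion of .44: 440 and 560 both exceed 5
print(is_large_sample(1000, 0.44))
# Hypothetical small sample: n = 40 with proportion .10 gives n*p = 4
print(is_large_sample(40, 0.10))
```

When p is unknown, the sample proportion 𝑝̂ is passed in its place, as the note describes.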

Estimator of the Standard Deviation of 𝑝̂

The value of 𝑠𝑝̂, which gives a point estimate of 𝜎𝑝̂, is calculated as follows. Here, 𝑠𝑝̂ is an
estimator of 𝜎𝑝̂.

\[ s_{\hat{p}} = \sqrt{\frac{\hat{p}\hat{q}}{n}} \]

Confidence Interval for the Population Proportion, p

The (1 − 𝛼)100% confidence interval for the population proportion, p, is

𝑝̂ ± 𝑧𝑠𝑝̂

The value of z used here is obtained from the standard normal distribution table for the given
confidence level, and 𝑠𝑝̂ = √(𝑝̂𝑞̂/𝑛). The term 𝑧𝑠𝑝̂ is called the margin of error, E.

Problem:
According to a survey conducted by Pew Research Center in June 2009, 44% of people aged 18
to 29 years said that religion is very important to them. Suppose this result is based on a sample of
1000 people aged 18 to 29 years.

(a) What is the point estimate of the corresponding population proportion?


(b) Find, with a 99% confidence level, the percentage of all people aged 18 to 29 years who
will say that religion is very important to them. What is the margin of error of this estimate?

Solution:
Let p be the proportion of all people aged 18 to 29 years who will say that religion is very important
to them, and let 𝑝̂ be the corresponding sample proportion. From the
given information,

n = 1000, 𝑝̂ = .44, 𝑞̂ = 1 − 𝑝̂ = 1 − .44 = .56

First, we calculate the value of the standard deviation of the sample proportion as follows:

\[ s_{\hat{p}} = \sqrt{\frac{\hat{p}\hat{q}}{n}} = \sqrt{\frac{(.44)(.56)}{1000}} = .01569713 \]

Note that 𝑛𝑝̂ and 𝑛𝑞̂ are both greater than 5. Consequently, the sampling distribution of 𝑝̂ is
approximately normal, and we will use the normal distribution to make a confidence interval
about p.

(a) The point estimate of the proportion of all people aged 18 to 29 years who will say that
religion is very important to them is equal to .44; that is,
Point estimate of p =𝑝̂ =.44

(b) The confidence level is 99%, or .99. To find z for a 99% confidence level, first we find the
area in each of the two tails of the normal distribution curve, which is (1 − .99)/2 = .005. Then,
we look for .0050 and .0050 + .99 = .9950 areas in the normal distribution table to find the
two values of z. These two z values are (approximately) −2.58 and 2.58. Thus, we will use
z = 2.58 in the confidence interval formula. Substituting all the values in the confidence
interval formula for p, we obtain

𝑝̂ ± 𝑧𝑠𝑝̂ = .44 ± 2.58(.01569713) = .44 ± .04 = .40 to .48, or 40% to 48%

Thus, we can state with 99% confidence that .40 to .48, or 40% to 48%, of all people aged 18 to
29 years will say that religion is very important to them.

The margin of error associated with this estimate of p is .04, or 4%.
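
The interval and margin of error for this problem can be reproduced with a short sketch. The function name `proportion_interval` is hypothetical, and the z value is passed in as read from the normal distribution table.

```python
import math


def proportion_interval(p_hat: float, n: int, z: float) -> tuple[float, float, float]:
    """Return (lower, upper, margin) for p_hat ± z * sqrt(p_hat * q_hat / n)."""
    q_hat = 1.0 - p_hat
    s_p_hat = math.sqrt(p_hat * q_hat / n)   # estimated standard deviation of p_hat
    margin = z * s_p_hat                     # margin of error E
    return p_hat - margin, p_hat + margin, margin


# Survey problem: n = 1000, p_hat = .44, z = 2.58 (99% level)
lower, upper, margin = proportion_interval(0.44, 1000, 2.58)
print(f"{lower:.2f} to {upper:.2f}, margin of error {margin:.2f}")
```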
