STAT355 - Probability & Statistics Chapter 7: Statistical Intervals Based on a Single Sample

Fall 2011

Chapter 7 - Statistical Intervals Based on a Single Sample

7.1 Basic Properties of Confidence Intervals
7.2 Large-Sample Confidence Intervals for a Population Mean and Proportion
7.3 Intervals Based on a Normal Population Distribution
7.4 Confidence Intervals for the Variance and Standard Deviation of a Normal Population

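Basic Properties of Confidence Intervals

Consider a random sample $X_1, \dots, X_n$ from $N(\mu, \sigma^2)$, and let $x_1, \dots, x_n$ be the actual observations of the random sample.

Sample mean: $\bar{X} \sim N(\mu, \sigma^2/n)$, so
\[ Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1) \]
and
\[ P\left(-1.96 \le \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le 1.96\right) = 0.95 \]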

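The statement
\[ P\left(-1.96 \le \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le 1.96\right) = 0.95 \]
is equivalent to
\[ P\left(\bar{X} - 1.96\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}\right) = 0.95 \]
Thus,
\[ \left(\bar{X} - 1.96\frac{\sigma}{\sqrt{n}},\ \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}\right) \]
is a random interval that includes or covers the true value of $\mu$ with probability 0.95.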

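\[ \left(\bar{X} - 1.96\frac{\sigma}{\sqrt{n}},\ \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}\right) \tag{1} \]
is a random interval that includes or covers the true value of $\mu$ with probability 0.95.

Definition
If, after observing $X_1 = x_1, X_2 = x_2, \dots, X_n = x_n$, we compute the observed sample mean $\bar{x}$ and then substitute $\bar{x}$ into (1) in place of $\bar{X}$, the resulting fixed interval
\[ \left(\bar{x} - 1.96\frac{\sigma}{\sqrt{n}},\ \bar{x} + 1.96\frac{\sigma}{\sqrt{n}}\right) \]
is called a 95% confidence interval for $\mu$.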

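Definition
A $100(1-\alpha)\%$ confidence interval for the mean $\mu$ of a normal population when the value of $\sigma^2$ is known is given by
\[ \left(\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) \]
or, equivalently, by $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$.

For $\alpha = 0.1$: $z_{\alpha/2} = z_{0.05} = 1.645$.
For $\alpha = 0.05$: $z_{\alpha/2} = z_{0.025} = 1.96$.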
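
The known-$\sigma$ interval in the definition can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `z_interval` and the summary values (`xbar=80.0`, `sigma=2.0`, `n=25`) are hypothetical, and the critical value comes from the standard library's `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, conf=0.95):
    """CI for a normal mean with known sigma: xbar +/- z_{alpha/2} * sigma / sqrt(n)."""
    alpha = 1 - conf
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 critical value
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width


# Hypothetical summary values, chosen only for illustration.
lo, hi = z_interval(xbar=80.0, sigma=2.0, n=25, conf=0.95)
print(f"95% CI: ({lo:.3f}, {hi:.3f})")
```

Raising `conf` widens the interval, and raising `n` narrows it at the rate $1/\sqrt{n}$.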

Example
Exercise 1: Consider a normal population with the value of $\sigma$ known.
1. What is the confidence level for the interval $\bar{x} \pm 2.81\,\sigma/\sqrt{n}$?
2. What is the confidence level for the interval $\bar{x} \pm 1.44\,\sigma/\sqrt{n}$?
3. What value of $z_{\alpha/2}$ will result in a confidence level of 99.7%?
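
The link between a multiplier and its confidence level runs through the standard normal cdf: a multiplier $z$ corresponds to level $2\Phi(z) - 1$, and a level $1-\alpha$ corresponds to the multiplier $\Phi^{-1}(1 - \alpha/2)$. The sketch below is one way to check the three parts numerically; the helper names are hypothetical and not from the slides.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1


def confidence_level(multiplier):
    """Confidence level of the interval xbar +/- multiplier * sigma / sqrt(n)."""
    return 2 * std_normal.cdf(multiplier) - 1


def z_for_level(conf):
    """Critical value z_{alpha/2} for a two-sided interval with level conf."""
    alpha = 1 - conf
    return std_normal.inv_cdf(1 - alpha / 2)


for m in (2.81, 1.44):
    print(f"multiplier {m}: level = {confidence_level(m):.3f}")
print(f"z for 99.7%: {z_for_level(0.997):.2f}")
```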

Large-Sample Confidence Intervals for a Population Mean

Consider $X_1, \dots, X_n$ from $N(\mu, \sigma^2)$. Often, $\sigma^2$ is unknown. Let $S$ be the sample standard deviation.

Proposition
If $n$ is sufficiently large, the standardized variable
\[ Z = \frac{\bar{X} - \mu}{S/\sqrt{n}} \]
has approximately a standard normal distribution. This implies that
\[ \bar{x} \pm z_{\alpha/2}\frac{s}{\sqrt{n}} \]
is a large-sample confidence interval for $\mu$ with confidence level approximately $100(1-\alpha)\%$. This formula is valid regardless of the shape of the population distribution.
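
The word "approximately" can be checked by simulation. The sketch below draws repeated samples from a skewed (exponential) population and counts how often $\bar{x} \pm z_{\alpha/2}\, s/\sqrt{n}$ covers the true mean. The population, sample size, replication count and seed are all assumptions made for illustration; none of them comes from the slides.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev


def coverage(n=50, reps=2000, conf=0.95, seed=355):
    """Fraction of large-sample intervals xbar +/- z*s/sqrt(n) that cover the true mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    true_mean = 1.0  # mean of an exponential distribution with rate 1
    hits = 0
    for _ in range(reps):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        xbar, s = mean(sample), stdev(sample)
        half_width = z * s / sqrt(n)
        if xbar - half_width <= true_mean <= xbar + half_width:
            hits += 1
    return hits / reps


print(f"estimated coverage: {coverage():.3f}")
```

The estimated coverage should land near the nominal 0.95, though not exactly on it, which is what an approximate confidence level means.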

A Confidence Interval for a Population Proportion

Let $p$ denote the proportion of "successes" in a population. A random sample of $n$ individuals is to be selected, and $X$ is the number of successes in the sample. Provided that $n$ is small compared to the population size, $X$ can be regarded as a binomial rv with
\[ E(X) = np \quad \text{and} \quad \sigma_X = \sqrt{np(1-p)} \]
Furthermore, if both $np \ge 10$ and $n(1-p) \ge 10$, then $X$ has approximately a normal distribution.

The natural estimator of $p$ is $\hat{p} = X/n$, the sample fraction of successes. Since $\hat{p}$ is just $X$ multiplied by the constant $1/n$, $\hat{p}$ also has approximately a normal distribution. We know that $E(\hat{p}) = p$ (unbiasedness) and $\sigma_{\hat{p}} = \sqrt{p(1-p)/n}$.

The standard deviation $\sigma_{\hat{p}}$ involves the unknown parameter $p$. Standardizing $\hat{p}$ by subtracting $p$ and dividing by $\sigma_{\hat{p}}$ then implies that
\[ P\left(-z_{\alpha/2} < \frac{\hat{p} - p}{\sqrt{p(1-p)/n}} < z_{\alpha/2}\right) \approx 1 - \alpha \]

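Proposition
Let
\[ \tilde{p} = \frac{\hat{p} + z_{\alpha/2}^2/(2n)}{1 + z_{\alpha/2}^2/n} \]
Then a confidence interval for a population proportion $p$ with confidence level approximately $100(1-\alpha)\%$ is
\[ \tilde{p} \pm z_{\alpha/2}\,\frac{\sqrt{\hat{p}(1-\hat{p})/n + z_{\alpha/2}^2/(4n^2)}}{1 + z_{\alpha/2}^2/n} \]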

Exercise (7.2) 21
In a sample of 1000 randomly selected consumers who had opportunities to send in a rebate claim form after purchasing a product, 250 of these people said they never did so. Calculate an upper condence bound at the 95% condence level for the true proportion of such consumers who never apply for a rebate. Based on this bound, is there compelling evidence that the true proportion of such consumers is smaller than 1/3?
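
One way to approach this exercise numerically is sketched below. It assumes the upper bound is obtained from the score interval for $p$ by keeping only the upper limit and replacing $z_{\alpha/2}$ with $z_{\alpha}$; the function name `score_upper_bound` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def score_upper_bound(x, n, conf=0.95):
    """One-sided upper confidence bound for p from the score interval.

    Assumption: z_alpha replaces z_{alpha/2} because only one tail is used.
    """
    z = NormalDist().inv_cdf(conf)
    p_hat = x / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center + half_width


upper = score_upper_bound(x=250, n=1000)
print(f"95% upper confidence bound: {upper:.3f}")
print("bound is below 1/3:", upper < 1 / 3)
```

If the printed bound lies below 1/3, the data give evidence at this confidence level that the true proportion is smaller than 1/3.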

Intervals Based on a Normal Population Distribution

The CI for $\mu$ presented earlier is valid provided that $n$ is large. The resulting interval can be used whatever the nature of the population distribution. The CLT cannot be invoked, however, when $n$ is small. In this case, one way to proceed is to make a specific assumption about the form of the population distribution and then derive a CI tailored to that assumption.

Assumption
The population of interest is normal, so that $X_1, \dots, X_n$ constitutes a random sample from a normal distribution with both $\mu$ and $\sigma^2$ unknown.

The key result underlying the interval in the earlier section was that, for large $n$, the rv
\[ Z = \frac{\bar{X} - \mu}{S/\sqrt{n}} \]
has approximately a standard normal distribution. When $n$ is small, $S$ is no longer likely to be close to $\sigma$, so the variability in the distribution of $Z$ arises from randomness in both the numerator and the denominator. This implies that the probability distribution of $(\bar{X} - \mu)/(S/\sqrt{n})$ will be more spread out than the standard normal distribution.

The result on which inferences are based introduces a new family of probability distributions called t distributions.

Theorem
When $\bar{X}$ is the mean of a random sample of size $n$ from a normal distribution with mean $\mu$, the rv
\[ T = \frac{\bar{X} - \mu}{S/\sqrt{n}} \]
has a probability distribution called a t distribution with $n - 1$ degrees of freedom (df).

Properties of t Distributions

Although the variable of interest is still $(\bar{X} - \mu)/(S/\sqrt{n})$, we now denote it by $T$ to emphasize that it does not have a standard normal distribution when $n$ is small.

We know that a normal distribution is governed by two parameters; each different choice of $\mu$ in combination with $\sigma^2$ gives a particular normal distribution. Any particular t distribution results from specifying the value of a single parameter, called the number of degrees of freedom, abbreviated df.

We'll denote this parameter by the Greek letter $\nu$. Possible values of $\nu$ are the positive integers 1, 2, 3, ... So there is a t distribution with 1 df, another with 2 df, yet another with 3 df, and so on. For any fixed value of $\nu$, the density function that specifies the associated t curve is even more complicated than the normal density function. Fortunately, we need concern ourselves only with several of the more important features of these curves.

Let $t_\nu$ denote the t distribution with $\nu$ df.

1. Each $t_\nu$ curve is bell-shaped and centered at 0.
2. Each $t_\nu$ curve is more spread out than the standard normal ($z$) curve.
3. As $\nu$ increases, the spread of the corresponding $t_\nu$ curve decreases.
4. As $\nu \to \infty$, the sequence of $t_\nu$ curves approaches the standard normal curve (so the $z$ curve is often called the t curve with df $= \infty$).

\[ T = \frac{\bar{X} - \mu}{S/\sqrt{n}} \]

The number of df for $T$ is $n - 1$ because, although $S$ is based on the $n$ deviations $X_1 - \bar{X}, \dots, X_n - \bar{X}$, the fact that $\sum (X_i - \bar{X}) = 0$ implies that only $n - 1$ of these are freely determined. The number of df for a t variable is the number of freely determined deviations on which the estimated standard deviation in the denominator of $T$ is based.

The use of t distributions in making inferences requires notation for capturing t-curve tail areas, $t_{\alpha,\nu}$, analogous to $z_\alpha$ for the $z$ curve.

Notation: Let $t_{\alpha,\nu}$ = the number on the measurement axis for which the area under the t curve with $\nu$ df to the right of $t_{\alpha,\nu}$ is $\alpha$; $t_{\alpha,\nu}$ is called a t critical value.

For example, $t_{0.05,6}$ is the t critical value that captures an upper-tail area of 0.05 under the t curve with 6 df. Because t curves are symmetric about zero, $-t_{\alpha,\nu}$ captures lower-tail area $\alpha$.

Appendix Table A.5 gives $t_{\alpha,\nu}$ for selected values of $\alpha$ and $\nu$. The columns of the table correspond to different values of $\alpha$. To obtain $t_{0.05,15}$, go to the $\alpha = 0.05$ column, look down to the $\nu = 15$ row, and read $t_{0.05,15} = 1.753$.
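
A t critical value can also be approximated numerically instead of being read from a table. The sketch below uses only the Python standard library: it integrates the t density with Simpson's rule and then finds the critical value by bisection. Both numerical techniques are choices made for this illustration (they are not how Table A.5 was produced), the function names are hypothetical, and `math.gamma` overflows for large arguments, so the sketch is meant for small to moderate df.

```python
import math


def t_pdf(t, df):
    """Density of the t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """Area under the t curve to the right of t >= 0 (Simpson's rule on [0, t])."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # By symmetry the area to the right of 0 is 0.5.
    return 0.5 - total * h / 3


def t_critical(alpha, df):
    """Value t with upper-tail area alpha, found by bisection on [0, 100]."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > alpha:
            lo = mid  # tail area still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2


print(round(t_critical(0.05, 15), 3))  # compare with the table value 1.753
```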

The One-Sample t Confidence Interval

Proposition
Let $\bar{x}$ and $s$ be the sample mean and sample standard deviation computed from the results of a random sample from a normal population with mean $\mu$. Then a $100(1-\alpha)\%$ confidence interval for $\mu$ is
\[ \left(\bar{x} - t_{\alpha/2,n-1}\frac{s}{\sqrt{n}},\ \bar{x} + t_{\alpha/2,n-1}\frac{s}{\sqrt{n}}\right) \]
or, more compactly, $\bar{x} \pm t_{\alpha/2,n-1}\, s/\sqrt{n}$.

Example (11): Even as traditional markets for sweetgum lumber have declined, large section solid timbers traditionally used for construction bridges and mats have become increasingly scarce. The article "Development of Novel Industrial Laminated Planks from Sweetgum Lumber" (J. of Bridge Engr., 2008: 64-66) described the manufacturing and testing of composite beams designed to add value to low-grade sweetgum lumber. Here are data on the modulus of rupture:

6807.99 7437.88 7659.50 7422.69 7637.06 6872.39
7378.61 7886.87 6663.28 7663.18 7295.54 6316.67
6165.03 6032.28 6702.76 7713.65 6991.41 6906.04
7440.17 7503.33 6992.23 6981.46 7569.75 6617.17
6984.12 7093.71 8053.26 8284.75 7347.95 7674.99

Use R software.
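
For readers without R at hand, the same one-sample t interval can be sketched in Python with the standard library. The data are the 30 modulus-of-rupture values from the example; the critical value $t_{0.025,29} = 2.045$ is hard-coded from a standard t table, which is an assumption of this sketch rather than something computed here, and the variable names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

# Modulus of rupture data from the sweetgum example (n = 30).
mor = [
    6807.99, 7437.88, 7659.50, 7422.69, 7637.06, 6872.39,
    7378.61, 7886.87, 6663.28, 7663.18, 7295.54, 6316.67,
    6165.03, 6032.28, 6702.76, 7713.65, 6991.41, 6906.04,
    7440.17, 7503.33, 6992.23, 6981.46, 7569.75, 6617.17,
    6984.12, 7093.71, 8053.26, 8284.75, 7347.95, 7674.99,
]

T_CRIT = 2.045  # assumed table value of t_{0.025, 29} for a 95% interval

n = len(mor)
xbar = mean(mor)
s = stdev(mor)  # sample standard deviation (divisor n - 1)
half_width = T_CRIT * s / sqrt(n)
ci_low, ci_high = xbar - half_width, xbar + half_width

print(f"n = {n}, xbar = {xbar:.3f}, s = {s:.3f}")
print(f"95% t interval: ({ci_low:.3f}, {ci_high:.3f})")
```

The interval is only as trustworthy as the normality assumption, so a normal probability plot of the data is a sensible companion check.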

Example (12): Consider the following sample of fat content (in percentage) of $n = 10$ randomly selected hot dogs ("Sensory and Mechanical Assessment of the Quality of Frankfurters," J. of Texture Studies, 1990: 395-409):

25.2 21.3 22.8 17.0 29.8 21.0 25.5 16.0 20.9 19.5

Assuming that these were selected from a normal population distribution, find a 95% CI for (interval estimate of) the population mean fat content. Use your calculator to obtain $\bar{x}$ and $s$.

The Chi-Squared ($\chi^2$) Distribution

Definition
Let $X_1, X_2, \dots, X_n$ be a random sample from a normal distribution with parameters $\mu$ and $\sigma^2$. Then the rv
\[ \frac{(n-1)S^2}{\sigma^2} = \frac{\sum (X_i - \bar{X})^2}{\sigma^2} \]
has a chi-squared ($\chi^2$) probability distribution with $\nu = n - 1$ df.

Notation: Let $\chi^2_{\alpha,\nu}$, called a chi-squared critical value, denote the number on the horizontal axis such that $\alpha$ of the area under the chi-squared curve with $\nu$ df lies to the right of $\chi^2_{\alpha,\nu}$.

Remark: The chi-squared distribution is not symmetric.

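Confidence Interval for $\sigma^2$
From the theorem,
\[ P\left(\chi^2_{1-\alpha/2,n-1} < \frac{(n-1)S^2}{\sigma^2} < \chi^2_{\alpha/2,n-1}\right) = 1 - \alpha \]
we get the inequalities
\[ \frac{(n-1)S^2}{\chi^2_{\alpha/2,n-1}} < \sigma^2 < \frac{(n-1)S^2}{\chi^2_{1-\alpha/2,n-1}} \]
A $100(1-\alpha)\%$ confidence interval for the variance $\sigma^2$ of a normal population is
\[ \left(\frac{(n-1)s^2}{\chi^2_{\alpha/2,n-1}},\ \frac{(n-1)s^2}{\chi^2_{1-\alpha/2,n-1}}\right) \]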
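
As a numerical illustration, the sketch below applies the variance interval to the hot dog fat-content sample from Example (12), assuming that population is normal. The critical values $\chi^2_{0.025,9} \approx 19.023$ and $\chi^2_{0.975,9} \approx 2.700$ are hard-coded from a standard chi-squared table (an assumption of this sketch), and taking square roots of the endpoints gives an interval for $\sigma$.

```python
from math import sqrt
from statistics import variance

# Fat content (%) of n = 10 hot dogs, from Example (12).
fat = [25.2, 21.3, 22.8, 17.0, 29.8, 21.0, 25.5, 16.0, 20.9, 19.5]

CHI2_UPPER = 19.023  # assumed table value of chi^2_{0.025, 9}
CHI2_LOWER = 2.700   # assumed table value of chi^2_{0.975, 9}

n = len(fat)
ss = (n - 1) * variance(fat)  # (n - 1) * s^2, the sum of squared deviations

# The larger critical value gives the lower endpoint, and vice versa.
var_low, var_high = ss / CHI2_UPPER, ss / CHI2_LOWER
sd_low, sd_high = sqrt(var_low), sqrt(var_high)

print(f"95% CI for variance: ({var_low:.2f}, {var_high:.2f})")
print(f"95% CI for std dev:  ({sd_low:.2f}, {sd_high:.2f})")
```

Because the chi-squared distribution is not symmetric, the interval is not centered at $s^2$.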

(Suppl) 51
An April 2009 survey of 2253 American adults conducted by the Pew Research Center's Internet & American Life Project revealed that 1262 of the respondents had at some point used wireless means for online access.
1. Calculate and interpret a 95% CI for the proportion of all American adults who at the time of the survey had used wireless means for online access.
2. What sample size is required if the desired width of the 95% CI is to be at most 0.04, irrespective of the sample results?
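
Both parts can be sketched numerically. Part 1 below uses the score interval for a proportion. Part 2 makes a simplifying assumption: it sizes the sample with the traditional interval $\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$ at the worst case $\hat{p} = 0.5$, which gives a slightly larger $n$ than working from the score interval itself. The function names are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def score_interval(x, n, conf=0.95):
    """Two-sided score confidence interval for a population proportion."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    p_hat = x / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


def sample_size_for_width(width, conf=0.95):
    """Smallest n with 2 * z * sqrt(0.25 / n) <= width (worst case p_hat = 0.5).

    Assumption: the traditional interval is used for sizing, not the score interval.
    """
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return ceil(z**2 / width**2)


low, high = score_interval(x=1262, n=2253)
print(f"95% CI for p: ({low:.4f}, {high:.4f})")
print("n for width <= 0.04:", sample_size_for_width(0.04))
```

The interval is read as a statement about the procedure: intervals built this way capture the true proportion in about 95% of repeated samples.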
