Statistical Inference
Point Estimation
1. Sample mean, sample variance, sample proportion
2. Comparing estimators: bias, consistency, efficiency, mean square error
3. Constructing estimators: MOM, MLE
Confidence Interval Estimation
1. z-interval and t-interval
2. One-sided and two-sided intervals
3. Sample size determination
Hypothesis Testing

Statistical Hypothesis Testing
Determining the quality of an estimator:
- Confidence interval: a range of values; the size of the range indicates certainty.
- Test: checks the plausibility that a claimed parameter value is realistic.
Why use a test of hypothesis when CIs often deliver more information?
Reason: a test summarizes the sample so that the result (the test decision) can be easily interpreted without knowledge of statistics (compared to CIs).

Statistical Hypothesis Testing
The hypothesis test has two possible but contradictory truths, written in terms of hypotheses:
$H_0$: Null Hypothesis
- Usually the simple hypothesis.
- The prior belief that we aim to disprove with the data.
- Typically, we CANNOT prove $H_0$, but merely reinforce our belief in it.
$H_A$: Alternative Hypothesis
- Everything but $H_0$.
- To test/support something (empirically), we usually state it as $H_A$.

Hypothesis Testing
The test type can be deduced from the alternative.
One-sided test of hypothesis:
$H_0: \mu \le \mu_0$ (is claimed) vs. $H_A: \mu > \mu_0$
Often, we can write $H_0: \mu = \mu_0$ instead.
Two-sided test of hypothesis:
$H_0: \mu = \mu_0$ vs. $H_A: \mu \ne \mu_0$

Class Activity
(1) Is the coin fair? $P = P(\text{heads})$
A. $H_0: P = 1/2$   B. $H_A: P = 1/2$   C. $H_0: P \ne 1/2$
(2) A machine produces product (X) with mean $\mu$ and variability $\sigma^2$. Is the variability under control? (Here the variability is under control when it is smaller than a known value, say $\sigma_0^2$.)
A. $H_A: \sigma^2 \le \sigma_0^2$   B. $H_0: \sigma^2 > \sigma_0^2$   C. $H_0: \sigma^2 \le \sigma_0^2$
(3) Do we support the hypothesis that the machine produces an item of a size larger than a known $\mu^*$?
A. $H_A: \mu \le \mu^*$   B. $H_A: \mu > \mu^*$   C. $H_0: \mu > \mu^*$

Examples of Hypotheses to Test
(1) Is the coin fair? $P = P(\text{heads})$
$H_0: P = 1/2$ (our prior belief)
$H_A: P \ne 1/2$ (the opposite of our prior belief)
(2) A machine produces product (X) with mean $\mu$ and variability $\sigma^2$. Is the variability under control?
$H_0: \sigma^2 \le \sigma_0^2$, with $\sigma_0^2$ known
$H_A: \sigma^2 > \sigma_0^2$
(3) Does the (same) machine produce an item of sufficient size?
$H_0: \mu \le \mu^*$
$H_A: \mu > \mu^*$

Hypothesis Testing: Outline
1. Errors in Hypothesis Testing
2. Statistical Power
3. Hypothesis Test for the Population Mean
4. Hypothesis Test for the Population Proportion
5. Confidence Intervals and Hypothesis Testing
6. P-values

Errors in Hypothesis Testing
$\alpha = P(\text{type I error}) = P(\text{Reject } H_0 \text{ if } H_0 \text{ is true})$
- This error rate can be set low enough to ensure the test is safe.
$\beta = P(\text{type II error}) = P(\text{Do not reject } H_0 \text{ although } H_0 \text{ is false})$
- Defines how ineffective the test is at concluding $H_A$ is true when it really is.
The classic hypothesis test fixes $\alpha$ at some small (tolerable) value and accepts the type II error rate ($\beta$) that results from this.
The level fixed for $\alpha$ is called the significance level.

Errors in Hypothesis Testing
The following table helps us identify the two errors:

                        $H_0$ is True       $H_0$ is False
Do not reject $H_0$     Correct decision    Type II error
Reject $H_0$            Type I error        Correct decision

Courtroom Example
Suppose you are the prosecutor in a courtroom trial. The defendant is either guilty or not. The jury will either convict or not.

                        $H_0$ is True = Not Guilty                     $H_0$ is False = Guilty
Do not reject $H_0$     Do not convict, defendant is not guilty        Do not convict, even though defendant is guilty
Reject $H_0$            Convict even though defendant is not guilty    Convict - defendant is guilty

Statistical Power
Power $= 1 - \beta = P(\text{reject the null hypothesis when it is false})$
Power curve: the Y-axis shows power, the X-axis is the value of $\mu$. In this case, $H_0: \mu = 0$.
At $\mu = 0$, $H_0$ is true, so the height of the curve there is equal to $\alpha$, the type I error rate.

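A minimal sketch of how such a power curve could be tabulated for a two-sided z-test of $H_0: \mu = 0$. The slide does not give $\sigma$, $n$, or $\alpha$, so the defaults below (sigma = 1, n = 25, alpha = 0.05) and the function name `power_two_sided` are assumptions made only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def power_two_sided(mu, mu0=0.0, sigma=1.0, n=25, alpha=0.05):
    """Power of the two-sided z-test of H0: mu = mu0 when the true mean is mu.

    sigma, n and alpha are hypothetical defaults, not values from the slides.
    """
    z = NormalDist()
    crit = z.inv_cdf(1 - alpha / 2)          # z_{alpha/2}
    shift = (mu - mu0) / (sigma / sqrt(n))   # mean of the z-statistic under the true mu
    # P(Z > crit) + P(Z < -crit) when Z ~ N(shift, 1)
    return (1 - z.cdf(crit - shift)) + z.cdf(-crit - shift)


# A few points of the power curve; at mu = mu0 the power equals alpha.
for mu in (-1.0, -0.5, 0.0, 0.5, 1.0):
    print(f"mu = {mu:+.1f}  power = {power_two_sided(mu):.4f}")
```

The curve is lowest at $\mu = \mu_0$, where it equals the type I error rate, and rises toward 1 as the true mean moves away from the hypothesized value.
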
Class Activity 2
In 1990, a study on the weight of students at GT provided an average weight of 160 lbs. We would like to test our belief that the GT student weight average did not increase in the past 15 years.
1. What is the alternative hypothesis?
A. $H_A: \mu = 160$   B. $H_A: \mu > 160$   C. $H_A: \mu < 160$
2. Test $H_0: \mu = 160$ vs. $H_A: \mu > 160$. What is $P(\text{Reject } H_0 \mid \mu = 160)$?
A. Type I error   B. Type II error   C. Power
3. Test $H_0: \mu = 160$ vs. $H_A: \mu > 160$. What is $P(\text{Reject } H_0 \mid \mu > 160)$?
A. Type I error   B. Type II error   C. Power

Class Activity 2 - Answers
In 1990, a study on the weight of students at GT provided an average weight of 160 lbs. We would like to test our belief that the GT student weight average did not increase in the past 15 years.
1. What is the alternative hypothesis?
B. $H_A: \mu > 160$
2. Test $H_0: \mu = 160$ vs. $H_A: \mu > 160$. What is $P(\text{Reject } H_0 \mid \mu = 160)$?
A. Type I error
3. Test $H_0: \mu = 160$ vs. $H_A: \mu > 160$. What is $P(\text{Reject } H_0 \mid \mu > 160)$?
C. Power

Hypothesis Test for the Population Mean
Suppose we want to test $H_0: \mu = 160$ vs. $H_A: \mu > 160$.
The conventional way to test this hypothesis is to find the test for which the type I error rate ($\alpha$) is fixed at a particular value (e.g., $\alpha$ = 0.01, 0.05, 0.10).

Example, continued
Example: if $\sigma$ is known, we would reject if $\bar{X}$ is large relative to the hypothesized mean ($\mu = 160$), so the test is to reject $H_0$ if $\bar{X} > c$.
$$\alpha = P(\bar{X} > c \mid \mu = 160) = P\left(Z > \frac{c - 160}{\sigma/\sqrt{n}}\right)$$
This determines $c$ because
$$\frac{c - 160}{\sigma/\sqrt{n}} = z_\alpha$$

Example
Note: if we have an $\alpha = .05$ level test, then $z_{.05} = 1.645$,
$$c = 160 + z_\alpha \left(\sigma/\sqrt{n}\right)$$
and we would reject $H_0$ when
$$\bar{x} > 160 + z_\alpha \left(\sigma/\sqrt{n}\right)$$

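The cutoff $c$ can be computed directly from the formula above. The sketch below is illustrative only: the slides do not state $\sigma$ or $n$ for the weight example, so sigma = 20 and n = 100 are hypothetical values, and `critical_value` is a helper name introduced here.

```python
from math import sqrt
from statistics import NormalDist


def critical_value(mu0, sigma, n, alpha):
    """Cutoff c such that P(Xbar > c | mu = mu0) = alpha for a one-sided z-test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)   # upper-alpha quantile, 1.645 for alpha = .05
    return mu0 + z_alpha * sigma / sqrt(n)


sigma, n = 20.0, 100          # hypothetical values, not from the slides
c = critical_value(160.0, sigma, n, alpha=0.05)
print(f"reject H0 when the sample mean exceeds {c:.2f}")

# Check: under H0 the sample mean is N(160, sigma/sqrt(n)), so P(Xbar > c) should be alpha.
type_one = 1 - NormalDist(160.0, sigma / sqrt(n)).cdf(c)
print(f"P(Xbar > c | mu = 160) = {type_one:.3f}")
```
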
Different Tests for $\mu$
- Variance known (not typically realistic)
- Variance unknown (use $s^2$ instead of $\sigma^2$)
Different Hypotheses
- $H_0: \mu \ge \mu_0$ vs. $H_A: \mu < \mu_0$.
- $H_0: \mu \le \mu_0$ vs. $H_A: \mu > \mu_0$.
- $H_0: \mu = \mu_0$ vs. $H_A: \mu \ne \mu_0$.

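Test with known variances
If $\sigma$ is known, we base the test on the z-statistic:
$$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$$

Hypotheses                                         Rejection Region
$H_0: \mu \ge \mu_0$ vs. $H_A: \mu < \mu_0$        $Z < -z_\alpha$
$H_0: \mu \le \mu_0$ vs. $H_A: \mu > \mu_0$        $Z > z_\alpha$
$H_0: \mu = \mu_0$ vs. $H_A: \mu \ne \mu_0$        $|Z| > z_{\alpha/2}$
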
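The three rejection regions can be written as one small function. This is a sketch under stated assumptions: the function name `z_test`, the labels "less", "greater" and "two-sided", and the data in the usage lines are all hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def z_test(xbar, mu0, sigma, n, alpha, alternative):
    """Return (z, reject) for a z-test with known sigma.

    alternative is "less" (HA: mu < mu0), "greater" (HA: mu > mu0)
    or "two-sided" (HA: mu != mu0); these labels are chosen here.
    """
    z = (xbar - mu0) / (sigma / sqrt(n))
    q = NormalDist().inv_cdf
    if alternative == "less":
        reject = z < -q(1 - alpha)            # Z < -z_alpha
    elif alternative == "greater":
        reject = z > q(1 - alpha)             # Z > z_alpha
    elif alternative == "two-sided":
        reject = abs(z) > q(1 - alpha / 2)    # |Z| > z_{alpha/2}
    else:
        raise ValueError(f"unknown alternative: {alternative!r}")
    return z, reject


# Hypothetical data: the same sample can be significant one-sided but not two-sided.
z, reject = z_test(xbar=163.5, mu0=160.0, sigma=20.0, n=100, alpha=0.05, alternative="greater")
print(z, reject)
print(z_test(163.5, 160.0, 20.0, 100, 0.05, "two-sided"))
```
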
Test with unknown variances
If $\sigma$ is unknown, replace it with $s$ and use the analogous t-statistic in place of z:
$$t_{n-1} = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}$$
Key: the test result depends on the claimed value $\mu_0$.

1. The significance level of a hypothesis test is
A. $1 - P(\text{type II error})$   B. $P(\text{Reject } H_0 \text{ if } H_0 \text{ is true})$   C. $P(\text{type II error})$
2. A one-sided test for the population mean has an alternative hypothesis
A. H
0
:
0
B. H
A
:
0
C. H
A
: >
0

3. A two-sided test for the population mean has an alternative hypothesis
A. H
A
:
0
B. H
0
:
0
C. H
0
:
0

4. When the variance is unknown and the sample size is large, we use
A. z-test B. t-test C. an approximate z-test

Class Activity 3
Example of Two-sided Test
Test to see if the mean is significantly far from 90. The sample mean is 87.9 with known standard deviation of 5.9 for a sample size equal to 44.
$H_0: \mu = 90$ vs. $H_A: \mu \ne 90$
Reject $H_0$ if $|Z| > z_{\alpha/2}$, where
$$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$$
$$Z = \frac{87.9 - 90}{5.9/\sqrt{44}} = -2.36$$

Example, Continued
If we use the $\alpha = .10$ level test: [reject $H_0$ if $|Z| > 1.645$]
What is the power of the test?
Power $= P(\text{Reject } H_0 \text{ if } H_A \text{ is true})$
To examine power ($1 - \beta$), we need to consider various possible truths about $H_A: \mu \ne 90$.
Note: We will get different values of the power for different values of $\mu \ne 90$.

Example, Continued
Suppose $\mu = 91$ (so $H_A$ is true).
Power $= P(|Z| > 1.645 \text{ given } \mu = 91)$, where
$$Z = \frac{\bar{X} - 90}{\sigma/\sqrt{n}}$$
is no longer distributed $N(0,1)$.
We need to re-standardize the $Z$ in the probability above to figure out what the power is.

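Example: the messy part
$$
\begin{aligned}
P\left(\left|\frac{\bar{X} - 90}{\sigma/\sqrt{n}}\right| > 1.645\right)
&= P\left(\frac{\bar{X} - 90}{\sigma/\sqrt{n}} > 1.645\right) + P\left(\frac{\bar{X} - 90}{\sigma/\sqrt{n}} < -1.645\right) \\
&= P\left(\frac{\bar{X} - 90 - 1}{\sigma/\sqrt{n}} > 1.645 - \frac{1}{\sigma/\sqrt{n}}\right) + P\left(\frac{\bar{X} - 90 - 1}{\sigma/\sqrt{n}} < -1.645 - \frac{1}{\sigma/\sqrt{n}}\right)
\end{aligned}
$$
Since $\mu = 91$, subtracting 1 is needed so that the standardized quantity is $Z \sim N(0,1)$.
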
Example, Continued
[Figure: two density plots on a horizontal axis from -3 to 3 (vertical axis ticks 0.1 to 0.4).]
$$
\begin{aligned}
\text{Power} &= P(Z > 1.645 - 1.124) + P(Z < -1.645 - 1.124) \\
&= P(Z > 0.52) + P(Z < -2.77) \\
&= 0.3015 + 0.0028 \\
&= 0.3043
\end{aligned}
$$
POWER = 0.3043
Good: it's higher than $\alpha = .10$
Bad: 70% chance of a type II error

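The power calculation for this example can be checked numerically. The sketch below uses only the numbers given in the example ($\mu_0 = 90$, true $\mu = 91$, $\sigma = 5.9$, $n = 44$, $\alpha = .10$); the variable names are chosen here.

```python
from math import sqrt
from statistics import NormalDist

mu0, mu_true = 90.0, 91.0
sigma, n, alpha = 5.9, 44, 0.10

z = NormalDist()
crit = z.inv_cdf(1 - alpha / 2)                # z_{.05}, about 1.645
shift = (mu_true - mu0) / (sigma / sqrt(n))    # 1 / (sigma / sqrt(n)), about 1.124

# Power = P(Z > crit - shift) + P(Z < -crit - shift) with Z ~ N(0, 1)
power = (1 - z.cdf(crit - shift)) + z.cdf(-crit - shift)
beta = 1 - power                               # type II error probability at mu = 91
print(f"power = {power:.4f}, beta = {beta:.4f}")
```

The result should agree with the hand calculation (0.3043) up to table rounding, since the slide rounds the z-values to two decimals before looking them up.
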
Class Activity 4
1. Does the power of a test depend on the significance level $\alpha$?
A. No   B. The power increases as $\alpha$ increases   C. The power increases as $\alpha$ decreases
2. Does the power of a test depend on the variance $\sigma^2$?
A. No   B. The power increases as $\sigma^2$ increases   C. The power increases as $\sigma^2$ decreases
3. Does the power of a test depend on the sample size $n$?
A. No   B. The power increases as $n$ increases   C. The power increases as $n$ decreases

Procedure: Hypothesis Test for $\mu$
1. Set the significance level $\alpha$ (= .01, .05, .1)
2. Set the null and the alternative
- is the test one-sided or two-sided?
3. Type of the test: z-test or t-test
- known variance (z-test)
- unknown variance (use $s^2$ instead of $\sigma^2$ and use t-test)
4. Decision about the null hypothesis at significance level $\alpha$
- Reject the null (the data support the alternative)
- Do not reject the null (the null hypothesis is plausible)

Hypothesis Tests for p
Test statistic:
$$z_0 = \frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}$$
For a test of $H_0: p = p_0$ vs. $H_A: p \ne p_0$: reject $H_0$ if $|Z_0| > z_{\alpha/2}$.
For a test of $H_0: p \ge p_0$ vs. $H_A: p < p_0$: reject $H_0$ if $Z_0 < -z_\alpha$.

Inference on Binomial Parameter p: Example
A random number generator produces 0s and 1s with equal probability. After 50,000 runs, we observe 25,264 0s.
$n = 50{,}000$ trials, $x = 25{,}264$ zeros, $p = P(\text{generate a zero})$, and $\hat{p} = x/n = 0.5053$
(a) $H_0: p = p_0$ vs. $H_A: p \ne p_0$, with $p_0 = 0.5$
$$z_0 = \frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}} = \frac{0.5053 - 0.5}{\sqrt{(.5)(.5)/50{,}000}} = 2.361$$

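The test statistic for the random number generator example can be reproduced in a few lines. The data are from the example; the two significance levels tried at the end (0.05 and 0.01) are assumptions for illustration, since the slide does not fix $\alpha$.

```python
from math import sqrt
from statistics import NormalDist

n, x, p0 = 50_000, 25_264, 0.5

p_hat = x / n                                   # sample proportion of zeros
z0 = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)     # test statistic under H0: p = p0
print(f"p_hat = {p_hat:.4f}, z0 = {z0:.3f}")

# Two-sided decision at two illustrative levels (alpha values assumed here).
decisions = {}
for alpha in (0.05, 0.01):
    crit = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    decisions[alpha] = abs(z0) > crit
    print(f"alpha = {alpha}: reject H0? {decisions[alpha]}")
```

Note that 2.361 is obtained from the unrounded $\hat{p} = 0.50528$; plugging in the rounded 0.5053 gives a slightly larger value.
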
Inference on Binomial Parameter p: Example
2-sided interval:
$$\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1 - \hat{p})/n} = (0.4995, 0.5110)$$
In 99% of the samples made this way, the constructed CI will contain the true parameter $p = P(\text{generating a zero})$.

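As a check on the interval above, here is a short sketch that recomputes the 99% two-sided interval from the example's data. Variable names are chosen here.

```python
from math import sqrt
from statistics import NormalDist

n, x = 50_000, 25_264
conf = 0.99

p_hat = x / n
z = NormalDist().inv_cdf(1 - (1 - conf) / 2)    # z_{alpha/2} with alpha = .01, about 2.576
half_width = z * sqrt(p_hat * (1 - p_hat) / n)
lo, hi = p_hat - half_width, p_hat + half_width
print(f"99% CI for p: ({lo:.4f}, {hi:.4f})")
print("contains 0.5:", lo < 0.5 < hi)
```

The interval contains 0.5, so a two-sided test of $p = 0.5$ at the 1% level would not reject, even though the same data reject at the 5% level.
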
Confidence Intervals or Hypothesis Test
CI:
$$\left(\bar{X} - t_{n-1,\alpha/2}\frac{s}{\sqrt{n}},\ \bar{X} + t_{n-1,\alpha/2}\frac{s}{\sqrt{n}}\right)$$
Test: If we wish to test $H_0: \mu = \mu_0$ vs. $H_A: \mu \ne \mu_0$:
- If $\mu_0$ is not in the interval, we would reject $H_0$ at level $\alpha$.
- If $\mu_0$ is inside the CI, do not reject at level $\alpha$.

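The equivalence between the interval and the two-sided test can be illustrated numerically. The sketch below swaps in the known-variance z-interval for the t-interval on the slide (an assumption made here so that only the Python standard library is needed), and reuses the numbers of the two-sided example ($\bar{x} = 87.9$, $\sigma = 5.9$, $n = 44$, $\alpha = .10$). The second hypothesized value, 88.5, and both function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, alpha):
    """Two-sided (1 - alpha) confidence interval for mu with known sigma."""
    half = NormalDist().inv_cdf(1 - alpha / 2) * sigma / sqrt(n)
    return xbar - half, xbar + half


def two_sided_reject(xbar, mu0, sigma, n, alpha):
    """Two-sided z-test decision for H0: mu = mu0."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    return abs(z) > NormalDist().inv_cdf(1 - alpha / 2)


xbar, sigma, n, alpha = 87.9, 5.9, 44, 0.10
lo, hi = z_interval(xbar, sigma, n, alpha)
print(f"90% CI: ({lo:.2f}, {hi:.2f})")

results = {}
for mu0 in (90.0, 88.5):                 # 88.5 is a hypothetical second value
    inside = lo <= mu0 <= hi
    reject = two_sided_reject(xbar, mu0, sigma, n, alpha)
    results[mu0] = (inside, reject)
    print(f"mu0 = {mu0}: inside CI = {inside}, reject H0 = {reject}")
```

In both cases the test rejects exactly when the hypothesized value falls outside the interval.
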
$H_0: \mu \le 0.065$ vs. $H_A: \mu > 0.065$
(a) For an $\alpha = .10$ test, we reject $H_0$ if
$$t = \frac{\bar{X} - 0.065}{s/\sqrt{n}} > t_{60,.1} = 1.296$$
(note: if you used $z_{.10}$, the critical value is 1.282)
So we do not reject $H_0$ if $t < 1.296$.
(c) If $\bar{X} = 0.0768$ and $s = 0.0231$, then $t = 3.99$, and we reject $H_0$ at $\alpha = .01$.

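The t-statistic in part (c) can be verified with a short calculation. The sample size is not stated on the slide; n = 61 is an assumption inferred here from the 60 degrees of freedom in $t_{60,.1}$. The critical value 1.296 is copied from the slide rather than computed.

```python
from math import sqrt

mu0 = 0.065
xbar, s = 0.0768, 0.0231
n = 61                      # assumption: 60 degrees of freedom implies n = 61

t = (xbar - mu0) / (s / sqrt(n))
t_crit_10 = 1.296           # t_{60,.10}, taken from the slide

print(f"t = {t:.2f}")
print("reject H0 at alpha = .10:", t > t_crit_10)
```
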
Introduction to P-values
I. Decision making based on the significance level $\alpha$ (= .01, .05, .1)
II. Decision making based on the p-value
P-VALUE = measure of the plausibility of the null hypothesis based on the sample data.
The smaller the p-value is, the less plausible the null hypothesis is.
If the P-value is small, we reject $H_0$.

Decisions based on P-value
Convention:
If P-value < 0.01
- $H_0$ is not plausible / $H_A$ is supported
- We would need to specify $\alpha$ even smaller in order not to reject $H_0$
If P-value > 0.1
- $H_0$ is plausible
- Few significance tests use levels higher than .1
If 0.01 < P-value < 0.1
- we have some evidence that $H_0$ is not plausible, but we need further investigation

Test Statistic and P-value
A test statistic is a function of the sample $X_1, \ldots, X_n$ on which the decision (reject or do not reject $H_0$) is to be based.
$T(X_1, \ldots, X_n)$ = measure of discrepancy between the data and $H_0$
1. Known variance:
$$T(X_1, \ldots, X_n) = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$$
2. Unknown variance:
$$T(X_1, \ldots, X_n) = \frac{\bar{X} - \mu_0}{S/\sqrt{n}}$$

P-value Computation: Hypothesis Test for $\mu$
Unknown variance:
$$T(X_1, \ldots, X_n) = \frac{\bar{X} - \mu_0}{S/\sqrt{n}}, \qquad t(x_1, \ldots, x_n) = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$
1. Two-sided test
$$p\text{-value} = P(|T(X_1, \ldots, X_n)| > |t(x_1, \ldots, x_n)|) = 2P(T > |t|)$$
2. One-sided test ($H_A: \mu > \mu_0$)
$$p\text{-value} = P(T(X_1, \ldots, X_n) > t(x_1, \ldots, x_n)) = P(T > t) = 1 - \Phi(t)$$
3. One-sided test ($H_A: \mu < \mu_0$)
$$p\text{-value} = P(T(X_1, \ldots, X_n) < t(x_1, \ldots, x_n)) = P(T < t) = \Phi(t)$$
Here $\Phi$ denotes the cumulative distribution function of $T$ when $H_0$ is true.

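The three p-value formulas can be collected in one helper. This is a sketch, assuming the reference distribution of the statistic is standard normal (exact for the known-variance z-statistic, and only a large-sample approximation for the t-statistic with unknown variance). The function name `p_value` and the alternative labels are chosen here.

```python
from statistics import NormalDist


def p_value(t_obs, alternative):
    """P-value for an observed statistic, using N(0, 1) as the reference distribution.

    alternative: "two-sided", "greater" (HA: mu > mu0) or "less" (HA: mu < mu0).
    """
    Phi = NormalDist().cdf
    if alternative == "two-sided":
        return 2 * (1 - Phi(abs(t_obs)))   # 2 P(T > |t|)
    if alternative == "greater":
        return 1 - Phi(t_obs)              # P(T > t)
    if alternative == "less":
        return Phi(t_obs)                  # P(T < t)
    raise ValueError(f"unknown alternative: {alternative!r}")


# Hypothetical observed statistic
t_obs = 1.5
for alt in ("two-sided", "greater", "less"):
    print(f"{alt:>9}: {p_value(t_obs, alt):.4f}")
```

For a positive observed statistic the two-sided p-value is twice the "greater" one, and the two one-sided p-values always sum to 1.
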
Class Activity 5
A cigarette manufacturer claims that the average nicotine content of a brand of cigarettes is at most $\mu_0 = 1.5$. We observe the nicotine content for 100 cigarettes with a sample mean $\bar{x} = 1.7$ and sample standard deviation $s = 1.3$.
1. What is the alternative hypothesis?
A. $H_A: \mu \ne 1.5$   B. $H_A: \mu < 1.5$   C. $H_A: \mu > 1.5$
2. At a significance level $\alpha = .01$, what is your decision?
A. strongly suggest $H_A$   B. $H_0$ is plausible   C. strongly support $H_0$
3. What is the p-value? Denote $Z = \sqrt{n}(\bar{X} - \mu_0)/S$ and $z = \sqrt{100}(1.7 - 1.5)/1.3$.
A. $P(Z \ge z)$   B. $P(Z \le z)$   C. $P(|Z| \ge |z|)$

Intuition for the P-value
Given the observed data x, the p-value p(x) for a hypothesis test is the probability of obtaining the observed sample of data x or an even more rejectable sample of data when the null hypothesis is true.
1. A more rejectable sample is a sample for which the null hypothesis is even less plausible than for x.
2. We compute the p-value assuming that the null hypothesis is true.

Intuition for the p-value
$n = 13$, $\bar{x} = 2.879$, $\sigma = 0.325$
$H_0: \mu = 3$ vs. $H_A: \mu \ne 3$
[Figure: two density plots on a horizontal axis from -3 to 3 (vertical axis ticks 0.1 to 0.4).]
p-value = P(getting a more rejectable value of $\bar{X}$)
= P($\bar{X}$ is more than $|3 - 2.879| = 0.121$ away from $\mu$)
$$
\begin{aligned}
P(|\bar{X} - \mu| > .121) &= P\left(|Z| > \frac{.121}{.325/\sqrt{13}}\right) \\
&= P(|Z| > 1.34) \\
&= 2\Phi(-1.34) = 0.18
\end{aligned}
$$

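The p-value in this example can be reproduced from the numbers given ($n = 13$, $\bar{x} = 2.879$, $\sigma = 0.325$, $\mu_0 = 3$). A minimal sketch, with variable names chosen here:

```python
from math import sqrt
from statistics import NormalDist

n, xbar, sigma, mu0 = 13, 2.879, 0.325, 3.0

z = (xbar - mu0) / (sigma / sqrt(n))      # observed z-statistic, about -1.34
p = 2 * NormalDist().cdf(-abs(z))         # two-sided p-value: 2 * Phi(-|z|)
print(f"z = {z:.2f}, p-value = {p:.2f}")
```

A p-value of about 0.18 exceeds 0.1, so under the convention stated earlier $H_0$ remains plausible.
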
Class Activity 6
Test to see if the mean is significantly far from 90. The sample mean is 87.9 with known standard deviation of 5.9 for a sample of size 44.
1. What is the alternative hypothesis?
A. $H_A: \mu \ne 90$   B. $H_A: \mu < 90$   C. $H_A: \mu > 90$
2. At a significance level $\alpha = .1$, what is your decision?
A. strongly suggest $H_A$   B. $H_0$ is plausible   C. strongly support $H_0$
3. What is the p-value? Denote $Z = \sqrt{n}(\bar{X} - \mu_0)/\sigma$ and $z = \sqrt{44}(87.9 - 90)/5.9$.
A. $P(Z \ge z)$   B. $P(Z \le z)$   C. $P(|Z| \ge |z|)$

Example of Two-sided Test
Test to see if the mean is significantly far from 90. The sample mean is 87.9 with known standard deviation of 5.9.
$H_0: \mu = 90$ vs. $H_A: \mu \ne 90$
Test statistic:
$$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$$
Observed test statistic:
$$z = \frac{87.9 - 90}{5.9/\sqrt{44}} = -2.36$$

P-value and significance level
$H_0: \mu = 90$ vs. $H_A: \mu \ne 90$
Reject if $|z| > z_{\alpha/2}$, where
$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$
[Figure: two density plots on a horizontal axis from -3 to 3 (vertical axis ticks 0.1 to 0.4).]
P-value = P(more rejectable Z)
= P(|Z| > 2.36)
= 2(.0091) = .0182
Note: if we have an $\alpha = .10$ level test, then $z_{.05} = 1.645$ and we would reject $H_0$ ($|Z| > 1.645$). That means the p-value < 0.10.

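The link between the p-value and the significance level can be checked for this example: rejecting when $|z| > z_{\alpha/2}$ gives the same decision as rejecting when the p-value is below $\alpha$. The data are from the example; the extra levels 0.05 and 0.01 are tried here only for illustration.

```python
from math import sqrt
from statistics import NormalDist

xbar, mu0, sigma, n = 87.9, 90.0, 5.9, 44

z = (xbar - mu0) / (sigma / sqrt(n))       # about -2.36
p = 2 * NormalDist().cdf(-abs(z))          # two-sided p-value
print(f"z = {z:.2f}, p-value = {p:.4f}")

decisions = {}
for alpha in (0.10, 0.05, 0.01):
    by_region = abs(z) > NormalDist().inv_cdf(1 - alpha / 2)   # rejection-region rule
    by_p_value = p < alpha                                     # p-value rule
    decisions[alpha] = (by_region, by_p_value)
    print(f"alpha = {alpha}: region rule = {by_region}, p-value rule = {by_p_value}")
```

The two rules agree at every level, which is why the p-value can be read as the smallest significance level at which $H_0$ would be rejected.
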
Binomial Parameter p: P-value Example
A random number generator produces 0s and 1s with equal probability. After 50,000 runs, we observe 25,264 0s.
$n = 50{,}000$ trials, $x = 25{,}264$ zeros, $p = P(\text{generate a zero})$, and $\hat{p} = x/n = 0.5053$
$H_0: p = p_0$ vs. $H_A: p \ne p_0$
P-value = P(Z < -2.361 or Z > 2.361)
= 2P(Z < -2.361) = 0.0182

Example 1
In an experiment designed to measure the time necessary for an inspector's eyes to become used to the reduced amount of light necessary for penetrant inspection, the sample average time for n = 9 inspectors was 6.32 sec and the sample standard deviation was 1.65 sec. It has previously been assumed that the average adaptation time was at least 7 sec. Assuming adaptation time to be normally distributed, does the data contradict the prior belief?

Example 2
Recent information suggests that obesity is an increasing problem in America among all age groups. The Associated Press (October 9, 2002) reported that 1276 individuals in a sample of 4115 adults were found to be obese (a body mass index exceeding 30, where this index is a measure of weight relative to height). A 1998 survey based on people's own assessment revealed that 20% of adult Americans considered themselves obese. Does the recent data suggest that the true proportion of adults who are obese is more than 1.5 times the percentage from the self-assessment survey?

Example 3
The average height of females in the freshman class at GT has been 162.5 cm with a standard deviation of 6.9 cm. Is there reason to believe that there has been a change in the average height if a random sample of 50 females in the present freshman class has an average height of 165.2 cm? Use a P-value in your conclusion. Assume the standard deviation remains the same.

Example 4
In a survey from 2000, it was found that 33% of adults favored capital punishment. Do we have reason to believe that the proportion of adults favoring capital punishment today has decreased if, in a random sample of 85 adults, 26 favor capital punishment? Base your decision on the P-value.

Example 5
A large manufacturing firm is being charged with discrimination in its hiring practices.
What hypothesis is being tested if a jury commits a type I error by finding the firm guilty?
What hypothesis is being tested if a jury commits a type II error by finding the firm guilty?

Example 6
A soft-drink machine at a steak house is regulated so that the amount of drink dispensed is approximately normally distributed with a mean of 200 ml and standard deviation of 30 ml. The machine is checked periodically by taking a sample of 9 drinks and computing the average content. If the sample mean falls in the interval (191, 209), the machine is thought to be operating satisfactorily; otherwise we conclude that the true mean is not equal to 200 ml.
(a) Find the probability of committing a type I error when the true mean is equal to 200 ml.
(b) Find the probability of committing a type II error when the true mean is equal to 215 ml.
