30-09-2014

Interval Estimation and Sample Size Decision
Point estimation
Interval estimation for
  Population Mean
  Population Proportion
  Population Variance
Sample size decision in estimating
  Population Mean
  Population Proportion
  Population Variance
QAM II by Gaurav Garg (IIM Lucknow)
Statistical Estimation
We take data from a sample and use it to say something about the
population from which the sample was drawn.
A sample statistic is used to estimate an unknown parameter.
There are two types of estimation:
Point Estimation:
Calculation of a single value of a sample statistic
Interval Estimation:
Calculation of an interval using a sample statistic
This interval is calculated at a desired level of confidence,
e.g. 95% or 99%; the confidence level cannot be 100%.
Sample-to-sample variation (standard error) is also taken
into consideration.
Let $\theta$ be the unknown parameter.
Suppose T is the point estimate of $\theta$ and $E(T) = \theta$.
Fix the confidence level at $(1-\alpha) \times 100\%$.
$\alpha$ is the probability of error.
$(1-\alpha)$ is called the confidence coefficient.
Thus, for a 95% confidence level, $\alpha = 0.05$.
The confidence interval estimate of $\theta$ is [T-h, T+h].
It means that $P(T-h \le \theta \le T+h) = 1-\alpha$,
where h = critical value x standard error.
Confidence Interval Estimates
The formula for a confidence interval is [T-h, T+h].
T = unbiased (point) estimate of the unknown parameter
h = critical value x standard error of the estimate
The critical value is obtained using the confidence coefficient
$(1-\alpha)$ (will be discussed later).
Lower Confidence Limit = T-h
Upper Confidence Limit = T+h
[Figure: the point estimate lies midway between the lower confidence limit and the upper confidence limit; the distance between the two limits is the width of the confidence interval.]
Using the Central Limit Theorem, for a large sample
$$Z = \frac{T - \theta}{SE(T)} \sim N(0,1)$$
where T is the unbiased point estimate of $\theta$ and
SE(T) is the standard error of T.
The confidence coefficient is fixed as $(1-\alpha)$.
The critical value $z_{\alpha/2}$ is given by
$P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1-\alpha$, where $Z \sim N(0,1)$.

For $Z \sim N(0,1)$,
$$P\left(-z_{\alpha/2} < Z < z_{\alpha/2}\right) = 1-\alpha$$
Since $Z = \frac{T-\theta}{SE(T)} \sim N(0,1)$, this implies
$$P\left(-z_{\alpha/2} < \frac{T-\theta}{SE(T)} < z_{\alpha/2}\right) = 1-\alpha$$
or
$$P\left(T - z_{\alpha/2}\, SE(T) < \theta < T + z_{\alpha/2}\, SE(T)\right) = 1-\alpha$$
Thus the $(1-\alpha) \times 100\%$ confidence interval estimate of $\theta$ is
$$\left[\, T - z_{\alpha/2} \times SE(T),\ T + z_{\alpha/2} \times SE(T) \,\right]$$
Confidence Interval for Population Mean $\mu$
($\sigma$ Known)
When
the population standard deviation $\sigma$ is known, and
the population is normally distributed
(if the population is not normal, the sample size must be large),
the $(1-\alpha) \times 100\%$ confidence interval estimate of $\mu$
is given by
$$\left( \bar{x} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \right)$$
where $P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1-\alpha$, $Z \sim N(0,1)$.
Commonly used confidence levels and corresponding
critical values (N(0,1) Distribution)
[Figure: N(0,1) density centred at 0, with central area $1-\alpha = 0.95$ between $-z_{\alpha/2} = -1.96$ and $z_{\alpha/2} = 1.96$, and area $\alpha/2 = 0.025$ in each tail.]
Confidence Level   Confidence Coefficient $(1-\alpha)$   $\alpha$   Critical Value $z_{\alpha/2}$
80%                0.8                                   0.2        1.28
90%                0.9                                   0.1        1.645
95%                0.95                                  0.05       1.96
98%                0.98                                  0.02       2.33
99%                0.99                                  0.01       2.58
99.80%             0.998                                 0.002      3.09
99.90%             0.999                                 0.001      3.29
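[Figure: Distribution of the Sample Mean. The density of $\bar{x} \sim N(\mu, \sigma/\sqrt{n})$ is shown with central area $1-\alpha$ and area $\alpha/2$ in each tail. Beneath it, the values of the sample mean $\bar{x}$ for different samples are plotted with their confidence intervals $\left( \bar{x} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \right)$.]
$(1-\alpha) \times 100\%$ of the intervals will contain $\mu$.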
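The repeated-sampling reading of the confidence level can be checked with a small simulation. The sketch below is illustrative only: the function name `coverage`, the number of trials, the seed, and the choice of a "true" mean of 2.20 with $\sigma = 0.35$ and n = 11 are assumptions made for the demonstration, not part of the slides.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def coverage(mu, sigma, n, conf=0.95, trials=2000, seed=0):
    """Fraction of z-intervals (sigma known) that contain the true mean mu."""
    rng = random.Random(seed)                      # fixed seed: reproducible run
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # critical value z_{alpha/2}
    h = z * sigma / sqrt(n)                        # half-width of every interval
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = fmean(sample)
        if xbar - h <= mu <= xbar + h:
            hits += 1
    return hits / trials

# Hypothetical population: normal with mean 2.20 and sigma 0.35, samples of size 11.
frac = coverage(mu=2.20, sigma=0.35, n=11)
print(f"share of intervals containing mu: {frac:.3f}")
```

The printed share should sit close to 0.95, varying slightly with the seed and the number of trials.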
Example:
A sample of 11 circuits from a large normal population
has a mean resistance of 2.20 ohms.
We know from past testing that the population standard
deviation is 0.35 ohms.
Determine a 95% confidence interval for the true mean
resistance of the population.
Ans.
$$\bar{x} \pm z_{(0.025)} \frac{\sigma}{\sqrt{n}} = 2.20 \pm 1.96\,(0.35/\sqrt{11}) = 2.20 \pm 0.2068 = (1.9932,\ 2.4068)$$
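The same calculation can be sketched in Python. The helper name `z_interval` is an assumption for illustration; the critical value is taken from the standard library's `statistics.NormalDist` rather than from the table, so it is 1.95996 rather than the rounded 1.96.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, conf=0.95):
    """Confidence interval for the mean when sigma is known."""
    alpha = 1 - conf
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    h = z * sigma / sqrt(n)                   # critical value x standard error
    return xbar - h, xbar + h

# Circuit example: n = 11, mean 2.20 ohms, sigma = 0.35 ohms, 95% confidence.
lo, hi = z_interval(2.20, 0.35, 11)
print(round(lo, 4), round(hi, 4))
```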
Confidence Interval for Population Mean $\mu$
($\sigma$ Unknown)
Estimate $\sigma$ by the sample standard deviation
$$s_1 = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}$$
whose square $s_1^2$ is the unbiased estimate of $\sigma^2$.
Case 1: n is small
The value of $s_1$ varies from sample to sample.
This introduces extra variability.
The normal distribution cannot be used.
We use the t distribution with (n-1) d.f.
Case 2: n is large
When n is large, the t distribution approaches the normal distribution.
We use the N(0,1) distribution.
Case 1: $\sigma$ is unknown and n is small
Assumption: Population has normal distribution
The $(1-\alpha) \times 100\%$ confidence interval estimate of $\mu$ is given
by
$$\left( \bar{x} - t_{\alpha/2} \frac{s_1}{\sqrt{n}},\ \bar{x} + t_{\alpha/2} \frac{s_1}{\sqrt{n}} \right)$$
where $t_{\alpha/2}$ is given such that
$P(-t_{\alpha/2} < T < t_{\alpha/2}) = 1-\alpha$, for $T \sim t_{(n-1)}$.
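Some critical values of the $t_{(n-1)}$ distribution for
given $\alpha$ and d.f. (n-1)
[Figure: $t_{(n-1)}$ density centred at 0, with central area $1-\alpha$ between $-t_{\alpha/2}$ and $t_{\alpha/2}$, and area $\alpha/2$ in each tail.]
d.f. (n-1)   Critical Value $t_{\alpha/2}$ at $\alpha = 0.05$   Critical Value $t_{\alpha/2}$ at $\alpha = 0.10$
1            12.706                                             6.314
2            4.303                                              2.920
3            3.182                                              2.353
4            2.776                                              2.132
5            2.571                                              2.015
6            2.447                                              1.943
7            2.365                                              1.895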
Consider the same example
A sample of 11 circuits from a large normal population
has a mean resistance of 2.20 ohms.
Population standard deviation is not known.
Sample standard deviation ($s_1$) is 0.35 ohms.
Determine a 95% confidence interval for the true mean
resistance of the population.
Ans.
$$\bar{x} \pm t_{(0.025)} \frac{s_1}{\sqrt{n}} = 2.20 \pm 2.22814\,(0.35/\sqrt{11}) = 2.20 \pm 0.2351 = (1.9649,\ 2.4351)$$
If we are given $s^2$, we can use the following formula:
$$s_1^2 = \frac{n}{n-1}\, s^2$$
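A minimal Python sketch of the small-sample case follows. Python's standard library has no t quantile function, so the critical value is passed in by hand; 2.22814 (10 d.f., $\alpha = 0.05$) is the value used in the worked answer. The helper names `t_interval` and `s1_from_s` are hypothetical, and `s1_from_s` assumes that $s$ denotes the standard deviation computed with divisor n.

```python
from math import sqrt

def t_interval(xbar, s1, n, t_crit):
    """Confidence interval for the mean when sigma is unknown and n is small.

    t_crit is t_{alpha/2} with (n - 1) d.f., looked up in a t table.
    """
    h = t_crit * s1 / sqrt(n)
    return xbar - h, xbar + h

def s1_from_s(s, n):
    """Convert a divisor-n standard deviation s to s_1 via s_1^2 = n/(n-1) * s^2."""
    return sqrt(n / (n - 1)) * s

# Circuit example: n = 11, mean 2.20 ohms, s_1 = 0.35 ohms, t_{0.025} with 10 d.f.
lo, hi = t_interval(2.20, 0.35, 11, t_crit=2.22814)
print(round(lo, 4), round(hi, 4))
```

The interval is wider than the one obtained with known $\sigma$, because the t critical value exceeds 1.96.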
Case 2: $\sigma$ is unknown and n is large
Population may or may not have normal distribution
The $(1-\alpha) \times 100\%$ confidence interval estimate of $\mu$ is given
by
$$\left( \bar{x} - z_{\alpha/2} \frac{s_1}{\sqrt{n}},\ \bar{x} + z_{\alpha/2} \frac{s_1}{\sqrt{n}} \right)$$
where $z_{\alpha/2}$ is given such that,
for $Z \sim N(0,1)$, $P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1-\alpha$.
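Confidence Interval Estimate of $\mu$
$\sigma$ known
  n small (Normal Distribution) or n large (Any Distribution):
  $$\left( \bar{x} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \right)$$
$\sigma$ unknown
  n small (Normal Distribution):
  $$\left( \bar{x} - t_{\alpha/2} \frac{s_1}{\sqrt{n}},\ \bar{x} + t_{\alpha/2} \frac{s_1}{\sqrt{n}} \right)$$
  n large (Any Distribution):
  $$\left( \bar{x} - z_{\alpha/2} \frac{s_1}{\sqrt{n}},\ \bar{x} + z_{\alpha/2} \frac{s_1}{\sqrt{n}} \right)$$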
Confidence Intervals for Population Proportion $\pi$
Case 1:
Small Sample: out of scope
Case 2:
Large Sample
We know that for large n, with p the sample proportion,
$$Z = \frac{p - \pi}{\sqrt{\pi(1-\pi)/n}} \sim N(0,1)$$
For $Z \sim N(0,1)$, we have
$$P\left(-z_{\alpha/2} < Z < z_{\alpha/2}\right) = 1-\alpha$$
or
$$P\left(-z_{\alpha/2} < \frac{p - \pi}{\sqrt{\pi(1-\pi)/n}} < z_{\alpha/2}\right) = 1-\alpha$$
or
$$P\left(p - z_{\alpha/2}\sqrt{\pi(1-\pi)/n} < \pi < p + z_{\alpha/2}\sqrt{\pi(1-\pi)/n}\right) = 1-\alpha$$
Thus the $(1-\alpha) \times 100\%$ CI estimate of $\pi$ is given by
$$\left( p - z_{\alpha/2}\sqrt{\pi(1-\pi)/n},\ p + z_{\alpha/2}\sqrt{\pi(1-\pi)/n} \right)$$
This expression itself contains $\pi$, which is
unknown.
So, this CI estimate becomes meaningless.
We use the unbiased estimate p of $\pi$.
Then, the $(1-\alpha) \times 100\%$ CI estimate of $\pi$ is given by
$$\left( p - z_{\alpha/2}\sqrt{pq/n},\ p + z_{\alpha/2}\sqrt{pq/n} \right)$$
where q = 1-p.
Required Assumption: Large Sample only.
Example:
A random sample of 100 people shows that 25
have opened an IRA (individual retirement
arrangement) this year.
Construct a 95% confidence interval for the true
proportion of the population who have opened an IRA.
Ans.
$$p \pm z_{(0.025)}\sqrt{p(1-p)/n} = 25/100 \pm 1.96\sqrt{(0.25)(0.75)/100} = 0.25 \pm 1.96\,(0.0433) = (0.1651,\ 0.3349)$$
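A short Python sketch of the large-sample proportion interval is given below. The function name `proportion_interval` is an assumption for illustration, and the critical value comes from `statistics.NormalDist` instead of the table.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes, n, conf=0.95):
    """Large-sample confidence interval for a population proportion."""
    p = successes / n                              # sample proportion
    q = 1 - p
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # z_{alpha/2}
    h = z * sqrt(p * q / n)
    return p - h, p + h

# IRA example: 25 of 100 sampled people, 95% confidence.
lo, hi = proportion_interval(25, 100)
print(round(lo, 4), round(hi, 4))
```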

Confidence Interval for Population Variance $\sigma^2$
Variance is an inverse measure of a group's
homogeneity.
Variance is an important indicator of total quality in
standardized products and services.
Managers improve processes by reducing variance.
Variance is a measure of financial risk.
Variance of rates of return helps managers assess
financial and capital investment alternatives.
Variability is a reality in global markets.
Productivity, wages, and costs of living vary between
regions and nations.
Confidence Interval for Population Variance $\sigma^2$
Case 1:
Small Sample
Parent Population is Normal
Let us take a sample $x_1, x_2, \ldots, x_n$ from $N(\mu, \sigma)$.
Then,
$$\sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{\sigma} \right)^2 \sim \chi^2_{(n-1)}$$
We know that
$$s_1^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$$
So,
$$\frac{(n-1)\, s_1^2}{\sigma^2} \sim \chi^2_{(n-1)}$$
Then, the $(1-\alpha) \times 100\%$ CI estimate of $\sigma^2$ is given by
$$\frac{(n-1)\, s_1^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n-1)\, s_1^2}{\chi^2_{1-\alpha/2}}$$
or
$$\left( \frac{(n-1)\, s_1^2}{\chi^2_{\alpha/2}},\ \frac{(n-1)\, s_1^2}{\chi^2_{1-\alpha/2}} \right)$$
Here, $\chi^2_{\alpha/2}$ and $\chi^2_{1-\alpha/2}$ are critical values obtained
using the Chi Square distribution with (n-1) d.f.
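[Figure: Chi Square density with df = 7 and $\alpha = 0.10$: area $\alpha/2 = 0.05$ in each tail and central area $1-\alpha = 0.90$, with critical values $\chi^2_{1-\alpha/2} = 2.167$ and $\chi^2_{\alpha/2} = 14.067$.]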
Example:
The cholesterol concentration in the yolks of a
sample of 18 randomly selected eggs laid by
genetically engineered chickens were found to
have a mean value of 9.38 mg/g of yolk and a
standard deviation of 1.62 mg/g.
Use this information to construct a confidence
interval estimate of the true variance of the
cholesterol concentration in these egg yolks.
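One way to work this example is sketched below. Two things are assumptions that the slide does not state: a 95% confidence level, and the Chi Square critical values for 17 d.f. (30.191 for the upper 0.025 point and 7.564 for the upper 0.975 point), which are copied from a standard table. The function name `variance_interval` is hypothetical.

```python
def variance_interval(s1, n, chi2_upper, chi2_lower):
    """Small-sample CI for the variance of a normal population.

    chi2_upper is chi^2_{alpha/2} and chi2_lower is chi^2_{1-alpha/2},
    both with (n - 1) d.f., looked up in a Chi Square table.
    """
    ss = (n - 1) * s1 ** 2            # (n - 1) * s_1^2
    return ss / chi2_upper, ss / chi2_lower

# Egg yolk example: n = 18, s_1 = 1.62 mg/g.
# ASSUMED: 95% confidence, table values for 17 d.f.
lo, hi = variance_interval(1.62, 18, chi2_upper=30.191, chi2_lower=7.564)
print(round(lo, 3), round(hi, 3))
```

Note that the larger critical value produces the lower limit, because it appears in the denominator.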
Confidence Interval for Population Variance $\sigma^2$
Case 2:
Large Sample
Parent Population may or may not be Normal
We know that $E(s_1^2) = \sigma^2$.
Also, $S.E.(s_1^2) = \sigma^2 \sqrt{2/(n-1)}$ (proof is out of scope).
So,
$$\frac{s_1^2 - \sigma^2}{\sigma^2 \sqrt{2/(n-1)}} \sim N(0,1)$$
for large samples.
Using this, the $(1-\alpha) \times 100\%$ CI estimate of $\sigma^2$ is given by
$$\left( \frac{s_1^2}{1 + z_{\alpha/2}\sqrt{2/(n-1)}},\ \frac{s_1^2}{1 - z_{\alpha/2}\sqrt{2/(n-1)}} \right)$$
Example:
A technologist is developing a new method for processing
a food material.
For best quality, it is important to control moisture content
in the final product.
So, as one part of determining the practicality of the new
method, the technologist must estimate the variability of
water content in the resulting product.
He collects 50 specimens of product from the new
process, and determines the percent water in each.
These 50 specimens give a sample mean water content of
43.24% and a sample standard deviation of 7.93%.
Compute a 95% confidence interval estimate of the true
variance of the percentage water for this new process.
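The large-sample variance interval for this example can be sketched as follows. The function name `variance_interval_large` is hypothetical, and the critical value is taken from `statistics.NormalDist` rather than from the table.

```python
from math import sqrt
from statistics import NormalDist

def variance_interval_large(s1, n, conf=0.95):
    """Large-sample CI for the variance: s_1^2 / (1 +/- z * sqrt(2 / (n - 1)))."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # z_{alpha/2}
    k = z * sqrt(2 / (n - 1))
    if k >= 1:
        raise ValueError("sample too small for the large-sample formula")
    var = s1 ** 2
    return var / (1 + k), var / (1 - k)

# Moisture example: n = 50 specimens, s_1 = 7.93 (percent water), 95% confidence.
lo, hi = variance_interval_large(7.93, 50)
print(round(lo, 2), round(hi, 2))
```

The guard on `k` reflects that the upper limit is only defined when $z_{\alpha/2}\sqrt{2/(n-1)} < 1$, which is one reason the formula is restricted to large n.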
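Sample Size Decision
(when Estimating $\mu$)
We have seen (for sufficiently large n) that
$$\bar{x} \sim N\left(\mu, \sigma/\sqrt{n}\right) \quad \text{or} \quad Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \sim N(0,1)$$
Error of Estimation: $e = |\bar{x} - \mu|$
Fix the confidence level at $(1-\alpha) \times 100\%$.
Obtain the critical value $z_{\alpha/2}$ using N(0,1) such that
$P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1-\alpha$.
Then, we have
$$z_{\alpha/2} = \frac{e}{\sigma/\sqrt{n}} \quad \text{or} \quad n = \left( \frac{z_{\alpha/2}\, \sigma}{e} \right)^2$$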
Thus the sample size for estimating the population mean $\mu$
is
$$n = \left( \frac{z_{\alpha/2}\, \sigma}{e} \right)^2$$
The critical value $z_{\alpha/2}$ can be taken from the table.
The Estimation Error (e) should be fixed by the researcher in
advance.
Clearly, $e > 0$.
The population standard deviation $\sigma$ can be estimated from
some other small sample or pilot survey as
Range/6 or by the sample standard deviation.
Example:
In a pilot survey, it is observed that the smallest
observation is 6 and the largest observation is 276.
What should be the sample size needed to estimate the
population mean within 5 with 90% confidence level?
Ans.
Estimate of population standard deviation: $\sigma = \frac{276 - 6}{6} = 45$
Estimation Error: $e = 5$
For 90% confidence level, critical value $z_{(0.05)} = 1.645$
So,
$$n = \left( \frac{z_{(0.05)}\, \sigma}{e} \right)^2 = \left( \frac{1.645 \times 45}{5} \right)^2 = 219.19 \approx 219$$
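The sample size formula is easy to sketch in Python. The function name `sample_size_mean` is an assumption for illustration; it returns the unrounded value and leaves the rounding decision to the caller. Because the critical value comes from `statistics.NormalDist` (1.64485 rather than the tabulated 1.645), the result differs slightly in the second decimal place.

```python
from statistics import NormalDist

def sample_size_mean(sigma, e, conf):
    """Unrounded sample size n = (z_{alpha/2} * sigma / e)^2."""
    if e <= 0:
        raise ValueError("estimation error e must be positive")
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return (z * sigma / e) ** 2

# Pilot survey: smallest value 6, largest 276, so sigma is estimated as Range/6.
sigma_hat = (276 - 6) / 6
n_raw = sample_size_mean(sigma_hat, e=5, conf=0.90)
print(sigma_hat, round(n_raw, 2))
```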
Sample Size Decision
(when Estimating $\pi$)
Similarly, the sample size for estimating the population
proportion $\pi$ is given by
$$n = \frac{\pi(1-\pi)\, z_{\alpha/2}^2}{e^2}$$
For a fixed confidence coefficient $(1-\alpha)$, the critical value $z_{\alpha/2}$ can
be taken from the normal table.
The Estimation Error ($e = |p - \pi|$) should be fixed by the
researcher in advance. Clearly, $e > 0$.
The population proportion $\pi$ can be estimated from some other
small sample or pilot survey.
If no information is available, it can be decided by the
researcher using past experience or can be taken as 0.5.
Example:
How large a sample would be necessary to
estimate the true proportion defective in a large
population within 3%, with 95% confidence?
(Assume a pilot sample yields p = 0.12)
Ans.
Estimate of population proportion: $p = 0.12$
Estimation Error: $e = 3/100 = 0.03$
For 95% confidence level, critical value $z_{(0.025)} = 1.96$
So,
$$n = \frac{pq\, z_{(0.025)}^2}{e^2} = \frac{(0.12)(0.88)(1.96)(1.96)}{(0.03)(0.03)} = 450.75 \approx 451$$
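A hedged Python sketch of the proportion case follows. The function name `sample_size_proportion` is hypothetical; the default of 0.5 for the pilot proportion mirrors the fallback mentioned on the slide, and rounding up with `math.ceil` is a choice made here so that the error bound is not exceeded.

```python
from math import ceil
from statistics import NormalDist

def sample_size_proportion(e, conf, p=0.5):
    """Unrounded sample size n = p(1-p) z_{alpha/2}^2 / e^2.

    p is a pilot estimate of the population proportion; 0.5 is used
    when no information is available.
    """
    if e <= 0:
        raise ValueError("estimation error e must be positive")
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return p * (1 - p) * z ** 2 / e ** 2

# Defective-proportion example: e = 0.03, 95% confidence, pilot p = 0.12.
n_raw = sample_size_proportion(e=0.03, conf=0.95, p=0.12)
n = ceil(n_raw)
print(round(n_raw, 2), n)
```

With p left at its default of 0.5 the required sample is larger, since p(1-p) is greatest at 0.5.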
Sample Size Decision
(when Estimating $\sigma^2$)
We know, for large samples,
$$\frac{s_1^2 - \sigma^2}{\sigma^2 \sqrt{2/(n-1)}} \sim N(0,1)$$
Similarly, the sample size for estimating the population variance $\sigma^2$ is
given by
$$n = 1 + \frac{2\, \sigma^4\, z_{\alpha/2}^2}{e^2}$$
For a fixed confidence coefficient $(1-\alpha)$, the critical value $z_{\alpha/2}$ can be
taken from the normal table.
The Estimation Error $e = |s_1^2 - \sigma^2|$ should be fixed by the
researcher in advance. Clearly, $e > 0$.
The population variance $\sigma^2$ can be estimated from some other small
sample or pilot survey.
If no information is available, it can be decided by the researcher
using past experience or can be taken as the square of Range/6.
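The step hidden behind "similarly" can be written out. This is a sketch under one assumption, made by analogy with the mean case: the estimation error e is set equal to the critical value times the standard error of $s_1^2$, which is $\sigma^2\sqrt{2/(n-1)}$ for large samples.

```latex
\begin{align*}
e &= z_{\alpha/2}\,\sigma^2 \sqrt{\frac{2}{n-1}} \\
e^2 &= z_{\alpha/2}^2\,\sigma^4\,\frac{2}{n-1} \\
n - 1 &= \frac{2\,\sigma^4\,z_{\alpha/2}^2}{e^2} \\
n &= 1 + \frac{2\,\sigma^4\,z_{\alpha/2}^2}{e^2}
\end{align*}
```

Squaring removes the square root, and solving for n gives the sample size formula for the variance.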
Estimating Total
In auditing, one is more interested in getting the estimate of the
population total amount.
The point estimate of it is given by $N\bar{x}$.
The CI estimate at the $(1-\alpha) \times 100\%$ confidence level is given by
$$N\bar{x} \pm N\, t_{\alpha/2} \frac{s_1}{\sqrt{n}} \quad \text{(small sample size, normal distribution)}$$
$$N\bar{x} \pm N\, z_{\alpha/2} \frac{s_1}{\sqrt{n}} \quad \text{(large sample size)}$$
The finite population correction (fpc) should be used when n / N > 0.05:
$$N\bar{x} \pm N\, t_{\alpha/2} \frac{s_1}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}} \quad \text{(small sample size, normal distribution)}$$
$$N\bar{x} \pm N\, z_{\alpha/2} \frac{s_1}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}} \quad \text{(large sample size)}$$
Example: A firm has a population of 1000 accounts and
wishes to estimate the total population value.
A sample of 80 accounts is selected with average
balance of $87.6 and standard deviation of $22.3.
Find the 95% confidence interval estimate of the total
balance.
Ans:
$N = 1000$, $n = 80$, $\bar{x} = 87.6$, $s_1 = 22.3$
Here n / N = 80/1000 = 0.08 > 0.05, so the fpc is used.
$$N\bar{x} \pm N\, z_{(0.025)} \frac{s_1}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}} = (1000)(87.6) \pm (1000)(1.96) \frac{22.3}{\sqrt{80}} \sqrt{\frac{1000-80}{1000-1}}$$
$$= 87{,}600 \pm 4{,}689.51 = (82910.49,\ 92289.51)$$
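The large-sample formula for a population total, with the fpc applied when n / N > 0.05, can be sketched in Python. The function name `total_interval` and its `fpc_threshold` parameter are assumptions for illustration, and the critical value is z from `statistics.NormalDist`, as the large-sample formula prescribes.

```python
from math import sqrt
from statistics import NormalDist

def total_interval(N, n, xbar, s1, conf=0.95, fpc_threshold=0.05):
    """Large-sample CI for a population total N * xbar.

    The finite population correction sqrt((N - n) / (N - 1)) is applied
    only when the sampling fraction n / N exceeds fpc_threshold.
    """
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    h = N * z * s1 / sqrt(n)
    if n / N > fpc_threshold:
        h *= sqrt((N - n) / (N - 1))
    point = N * xbar
    return point, h

# Accounts example: N = 1000, n = 80, mean balance 87.6, s_1 = 22.3.
point, h = total_interval(1000, 80, 87.6, 22.3)
print(round(point, 2), round(point - h, 2), round(point + h, 2))
```

Using a t critical value with 79 d.f. (about 1.99) in place of z would give a slightly wider interval.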
Estimating Total Difference
An auditor may wish to estimate the magnitude of
errors
An error is the difference of the values reached
during audit and the original values recorded.
A sample of n items is collected.
Let $D_i$ denote the error in the $i$th item $(i = 1, 2, \ldots, n)$.
$D_i = 0$, if the auditor finds that the original value is correct
$D_i > 0$, if the audited value is larger than the original value
$D_i < 0$, if the audited value is smaller than the original value
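Define:
$$\bar{D} = \frac{1}{n} \sum_{i=1}^{n} D_i \quad \text{and} \quad s_D = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (D_i - \bar{D})^2}$$
Point Estimate of Total Difference is $N\bar{D}$
CI estimate of Total Difference
$$N\bar{D} \pm N\, t_{\alpha/2} \frac{s_D}{\sqrt{n}} \quad \text{(for small samples, normal distribution)}$$
$$N\bar{D} \pm N\, z_{\alpha/2} \frac{s_D}{\sqrt{n}} \quad \text{(for large samples)}$$
The finite population correction (fpc) should be used when n / N > 0.05:
$$N\bar{D} \pm N\, t_{\alpha/2} \frac{s_D}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}} \quad \text{(for small samples, normal distribution)}$$
$$N\bar{D} \pm N\, z_{\alpha/2} \frac{s_D}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}} \quad \text{(for large samples)}$$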
Example:
Econe Dresses has 1200 inventory items.
In the past 15% items were incorrectly priced.
A sample of 120 items was selected.
Historical cost of each item was compared with
the audited value.
15 items differ in their historical costs and
audited values.
These values are as follows:
Historical Cost   Audited Value   $D_i$
261               240              21
87                105             -18
201               276             -75
121               110              11
315               298              17
411               356              55
249               211              38
216               305             -89
21                210            -189
140               152             -12
129               112              17
340               216             124
341               402             -61
135               97               38
228               220               8
$n = 120$, $N = 1200$
$\bar{D} = -0.95833$
$s_D = 25.24482$
n/N = 120/1200 = 0.1 > 0.05,
so we use the fpc.
95% CI is
$$N\bar{D} \pm N z_{(0.025)} \frac{s_D}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}} = \left[\, 1200\,(-0.95833) \pm 1200 \times 1.96 \times \frac{25.24482}{\sqrt{120}} \sqrt{\frac{1200-120}{1200-1}} \,\right]$$
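The quantities in this example can be reproduced with a short Python sketch. It assumes that the 105 sampled items not listed in the table have $D_i = 0$, and it takes the $D_i$ column exactly as tabulated; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# D_i values for the 15 items whose historical cost and audited value differ.
nonzero = [21, -18, -75, 11, 17, 55, 38, -89, -189, -12, 17, 124, -61, 38, 8]
N, n = 1200, 120

# ASSUMED: the remaining sampled items were priced correctly, so D_i = 0.
d = nonzero + [0] * (n - len(nonzero))

d_bar = mean(d)                      # mean difference
s_d = stdev(d)                       # divisor n - 1, matching s_D
z = NormalDist().inv_cdf(0.975)      # z_{0.025}

point = N * d_bar                    # point estimate of the total difference
h = N * z * s_d / sqrt(n)
if n / N > 0.05:                     # sampling fraction 0.1, so fpc applies
    h *= sqrt((N - n) / (N - 1))

print(round(d_bar, 5), round(s_d, 5))
print(round(point, 2), round(point - h, 2), round(point + h, 2))
```

The interval spans zero, so on these figures the sample does not show a total difference that is distinguishable from zero at the 95% level.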
SUMMARY (INTERVAL ESTIMATES)
Population Mean ($\mu$)
  $\sigma$ is known
    Small sample (Normal Distribution) or Large sample (Any Distribution): $\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
  $\sigma$ is not known
    Small sample (Normal Distribution): $\bar{x} \pm t_{\alpha/2} \frac{s_1}{\sqrt{n}}$
    Large sample (Any Distribution): $\bar{x} \pm z_{\alpha/2} \frac{s_1}{\sqrt{n}}$
Population Proportion ($\pi$)
  Small sample: OUT OF SCOPE
  Large sample (Any Distribution): $p \pm z_{\alpha/2} \sqrt{pq/n}$
Population Variance ($\sigma^2$)
  Small sample (Normal Distribution): $\left( \frac{(n-1)\, s_1^2}{\chi^2_{\alpha/2}},\ \frac{(n-1)\, s_1^2}{\chi^2_{1-\alpha/2}} \right)$
  Large sample (Any Distribution): $\left( \frac{s_1^2}{1 + z_{\alpha/2}\sqrt{2/(n-1)}},\ \frac{s_1^2}{1 - z_{\alpha/2}\sqrt{2/(n-1)}} \right)$
SUMMARY (SAMPLE SIZE DECISION)
For estimating Population Mean ($\mu$)
  Large sample (Any Distribution): $n = \left( \frac{z_{\alpha/2}\, \sigma}{e} \right)^2$
For estimating Population Proportion ($\pi$)
  Large sample (Any Distribution): $n = \frac{\pi(1-\pi)\, z_{\alpha/2}^2}{e^2}$
For estimating Population Variance ($\sigma^2$)
  Large sample (Any Distribution): $n = 1 + \frac{2\, \sigma^4\, z_{\alpha/2}^2}{e^2}$
