
Sampling Distribution and Point Estimation of Parameters

The most important phase of a statistical investigation, called statistical inference, is the procedure whereby, on the basis of observed experimental data in a particular sample, we generalize and draw conclusions about the population from which the sample was drawn. Among other things, this procedure is concerned with the estimation of unknown parameter values and with tests of hypotheses regarding these values.

The object of study of an investigator is the population. But investigating the entire population of interest may not be feasible for practical reasons. The only recourse is to pick a sample of a certain size from the population. From the sample data that are collected, we compute the values of appropriate statistics. On the basis of these values, we draw inferences about the corresponding population parameters that serve to describe the population.

Sampling Distributions
By the sampling distribution of a statistic we mean its theoretical probability distribution: the distribution of the values the statistic assumes over all possible samples.

The sampling distribution of a statistic depends on the distribution of the population, the size of
the samples, and the method of choosing the samples.

Sampling with Replacement


If samples of size n are drawn with replacement from a population with mean μ and variance σ², then the mean and variance of the sampling distribution of X̄ are given by

μ_X̄ = μ, that is, the population mean,

and

σ²_X̄ = σ²/n, that is, the population variance divided by the sample size.
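These two facts can be illustrated with a quick Monte Carlo sketch. The population below is an assumed toy example, not from the text; with replacement, the average of many sample means should be close to μ and their variance close to σ²/n.

```python
import random
import statistics

random.seed(1)
population = [2, 4, 6, 8, 10]            # assumed toy population
mu = statistics.mean(population)          # 6
var = statistics.pvariance(population)    # 8

n = 4
# Draw many samples of size n WITH replacement and record each sample mean.
means = [statistics.mean(random.choices(population, k=n))
         for _ in range(200_000)]

print(statistics.mean(means))       # close to mu = 6
print(statistics.pvariance(means))  # close to var / n = 2
```

With 200,000 replications the simulated values agree with μ and σ²/n to about two decimal places.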

Sampling Without Replacement


When random samples of size n are drawn without replacement from a finite population of size N
that has mean μ and variance σ2 , the mean and variance of the sampling distribution of 𝑋̅ are given by
μ_X̄ = μ

and

σ²_X̄ = (σ²/n) · (N − n)/(N − 1).

If the population size N is large compared to the sample size n, the factor (N − n)/(N − 1) is close to 1 and

σ²_X̄ ≈ σ²/n.
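For a small finite population, these formulas can be verified exactly by enumerating every possible sample without replacement. The population of N = 5 values below is an assumed illustration.

```python
import statistics
from itertools import combinations

population = [1, 2, 3, 4, 5]              # assumed small population, N = 5
N, n = len(population), 2
mu = statistics.mean(population)           # 3
var = statistics.pvariance(population)     # 2

# Every possible sample of size n drawn without replacement.
sample_means = [statistics.mean(s) for s in combinations(population, n)]

mean_of_means = statistics.mean(sample_means)
var_of_means = statistics.pvariance(sample_means)

print(mean_of_means)                       # 3, equal to mu
print(var_of_means)                        # 0.75
print(var / n * (N - n) / (N - 1))         # 0.75, matching the formula
```

The variance of the 10 possible sample means is exactly (σ²/n)·(N − n)/(N − 1) = (2/2)·(3/4) = 0.75, not σ²/n = 1, which is the point of the finite population correction.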
Standard Error of the Mean
The standard deviation of the sampling distribution of X̄ is commonly known as the standard error of the mean. When sampling with replacement, it is σ/√n. For samples drawn without replacement from a finite population of size N, the standard error of the mean is (σ/√n) · √((N − n)/(N − 1)). In the latter case it is approximately σ/√n if the population is very large compared to the sample size.

Example. A sample of 36 is selected at random from a population of adult males. If the standard deviation
of the distribution of their heights is known to be 3 inches, find the standard error of the mean if
a. The population consists of 1000 males
b. The population is extremely large.
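A minimal numerical sketch of the two cases, using σ = 3 and n = 36 from the example:

```python
from math import sqrt

sigma, n = 3.0, 36
N = 1000

se_infinite = sigma / sqrt(n)                       # case (b): very large population
se_finite = se_infinite * sqrt((N - n) / (N - 1))   # case (a): finite population correction

print(round(se_finite, 4))    # ≈ 0.4912 for N = 1000
print(round(se_infinite, 4))  # 0.5
```

With N = 1000 the correction factor √(964/999) is already close to 1, so the two answers differ only slightly.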

The Central Limit Theorem


The distribution of the sample mean X̄ of a random sample drawn from practically any population with mean μ and variance σ² can be approximated by a normal distribution with mean μ and variance σ²/n, provided the sample size is large.

Alternate Version of the Central Limit Theorem


When the sample size is large, the distribution of

(X̄ − μ) / (σ/√n)

is close to that of the standard normal variable Z. Thus

P(a < X̄ < b) ≈ P((a − μ)/(σ/√n) < Z < (b − μ)/(σ/√n)).

If the parent population is normally distributed, then the distribution of X̄ is exactly normal for any sample size.

Example.
1. The records of an agency show that the mean expenditure incurred by a student during 2016 was
$8000 and the standard deviation of the expenditure was $800. Find the approximate probability
that the mean expenditure of 64 students picked at random was
a. More than $7820
b. Between $7800 and $8120

2. The length of life (in hours) of a certain type of electric bulb is a random variable with a mean of
500 hours and a standard deviation of 35 hours. What is the approximate probability that a
random sample of 49 bulbs will have a mean life between 488 and 505 hours?
3. The weight of food packed in certain containers is a random variable with a mean weight of 16
ounces and a standard deviation of 0.6 ounces. If the containers are shipped in boxes of 36, find,
approximately, the probability that a randomly picked box will weigh over 585 ounces.

4. The amount of time (in minutes) devoted to commercials on a TV channel during any half-hour
program is a random variable whose mean is 6.3 minutes and whose standard deviation is 0.866
minutes. What is the approximate probability that a person who watches 36 half-hour programs
will be exposed to over 220 minutes of commercials?
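As a sketch of how these normal approximations are computed, example 2 (the bulbs) can be checked numerically, using the standard normal CDF written in terms of the error function:

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def prob_mean_between(a, b, mu, sigma, n):
    """P(a < X-bar < b) under the CLT normal approximation."""
    se = sigma / sqrt(n)   # standard error of the mean
    return phi((b - mu) / se) - phi((a - mu) / se)

# Example 2: mu = 500, sigma = 35, n = 49, so the standard error is 5.
p = prob_mean_between(488, 505, mu=500, sigma=35, n=49)
print(round(p, 4))  # P(-2.4 < Z < 1) ≈ 0.8331
```

The same function applies to the other examples once the event is restated in terms of the sample mean (for instance, a box total over 585 ounces means a mean container weight over 585/36 ounces).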

Sampling Distribution of the Proportion


If X is the number of items having a certain attribute in a sample of size n, then “the sample
proportion having the attribute” is X/n. The probability distribution of this statistic is called the sampling
distribution of the proportion.

For example, suppose we wish to arrive at a conclusion concerning the proportion of coffee drinkers in Metro Manila who prefer a certain brand of coffee. It would be impossible to question every coffee drinker in Metro Manila in order to compute the value of the parameter p representing the population
proportion. Instead, a large random sample is selected and the proportion 𝑝̂ of people in this sample
favoring the brand of coffee in question is calculated. The value 𝑝̂ is now used to make an inference
concerning the true proportion p.

If n items are picked independently from a population where the probability of success is p (not very near 0 or 1), and if n is large, then the distribution of the sample proportion X/n is approximately normal with mean p and variance p(1 − p)/n.

Converting to the z-scale, we find that

(X/n − p) / √(p(1 − p)/n)

has a distribution that is very close to the standard normal distribution provided n is large.

Example. According to the Mendelian Law of segregation in genetics, when certain types of peas are
crossed, the probability that a plant yields a yellow pea is 3/4 and that it yields a green pea is 1/4. For a
plant yielding 400 peas, find the following.
a. The standard error of the proportion of yellow peas
b. The approximate probability that the proportion of yellow peas will be between 0.71 and 0.78
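A sketch of the computation for this example, using the normal approximation to the sample proportion (standard normal CDF via the error function):

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

p, n = 0.75, 400
se = sqrt(p * (1 - p) / n)    # standard error of the proportion
print(round(se, 4))           # (a) ≈ 0.0217

# (b) P(0.71 < p-hat < 0.78) under the normal approximation
prob = phi((0.78 - p) / se) - phi((0.71 - p) / se)
print(round(prob, 4))
```

Part (b) reduces to P(−1.85 < Z < 1.39) approximately, a probability of about 0.88.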

Point Estimation
A point estimate provides a single numerical value as an assessed value of the parameter under
investigation. The procedure for finding a point estimate of a parameter involves the following steps.
1. Pick a random sample of a certain size.
2. Devise a random variable – a statistic.
3. Compute the value of the statistic from the observed sample data.

The statistic used for the purpose of estimation is called an estimator. A numerical value of the
estimator computed from a given set of sample values is called a point estimate of the parameter. Thus a
point estimate is a single number.
For example, we might use the sample mean X̄ based on four items as an estimate of the population mean μ. Its value is likely to differ from sample to sample.

Example.
1. The following readings give the weights of four bags of flour (in pounds) picked at random from
a shipment: 102, 101, 97, and 96. Find estimates of the following.
a. The true (population) mean weight of all the bags
b. The true variance of weights of all the bags
c. The true standard deviation of weights

2. Estimate the proportion of people who wear glasses if a random sample of 370 yielded 76
individuals who wore them.

3. A small company wants to know how many neckties it should make for the next year. When 100
people were interviewed, it was found that they had bought 165 ties produced by the company
during the current year. If there are 100,000 people in the region, estimate how many ties the
company should produce.
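The three examples above can be computed directly. The sketch below uses the sample mean, the sample variance s² (with divisor n − 1), and the sample proportion as the estimators.

```python
import statistics

# Example 1: four bag weights (pounds)
weights = [102, 101, 97, 96]
mean_w = statistics.mean(weights)      # point estimate of the mean: 99
var_w = statistics.variance(weights)   # s^2 with divisor n - 1: 26/3
sd_w = statistics.stdev(weights)       # s
print(mean_w, round(var_w, 3), round(sd_w, 3))

# Example 2: sample proportion of glasses wearers
p_hat = 76 / 370
print(round(p_hat, 4))                 # ≈ 0.2054

# Example 3: ties per person, scaled to the 100,000 people in the region
ties = 165 / 100 * 100_000
print(int(ties))                       # 165000
```

Each printed value is a point estimate: a single number computed from the sample and used in place of the unknown parameter.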

Criteria for Good Estimators


Many of the basic properties and criteria for characterizing point estimates were formulated and
developed by Sir Ronald A. Fisher (1890 – 1962). They include unbiasedness, efficiency, consistency,
and sufficiency.

The estimator of θ will be written as θ̂, where the hat (^) is read "estimator of."

An estimator θ̂ is an unbiased estimator of a parameter θ if its sampling distribution has mean θ, the parameter being estimated. That is, E(θ̂) = θ.

An estimator that is not unbiased is said to be biased. A biased estimator will, on the average,
either underestimate or overestimate θ.

As the sample size becomes extremely large, we would like the estimator to be close to the parameter with very high probability, almost to the point of being sure. Estimators with this property are said to be consistent.

If θ̂₁ and θ̂₂ are two unbiased estimators of the same parameter θ, then θ̂₁ is said to be more efficient than θ̂₂ if the variance of the sampling distribution of θ̂₁ is less than the variance of the sampling distribution of θ̂₂.

A sufficient estimator should be such that it utilizes all the information contained in the sample
for the purpose of estimating a given parameter.

The sample mean X̄ is an unbiased and consistent estimator of μ. Also, we find it satisfactory to use the sample proportion to estimate p, the sample variance s² to estimate the population variance σ², and the sample standard deviation s to estimate σ.
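Unbiasedness can be illustrated by simulation. The sketch below (with an assumed toy population) compares the sample variance with divisor n − 1, which is unbiased for σ², against the divisor-n version, which underestimates σ² on average by the factor (n − 1)/n.

```python
import random
import statistics

random.seed(7)
population = list(range(1, 11))          # assumed population; sigma^2 = 8.25
var = statistics.pvariance(population)

n, trials = 5, 100_000
unbiased, biased = [], []
for _ in range(trials):
    s = random.choices(population, k=n)       # sample with replacement
    unbiased.append(statistics.variance(s))   # divisor n - 1
    biased.append(statistics.pvariance(s))    # divisor n

print(statistics.mean(unbiased))  # ≈ 8.25, close to sigma^2
print(statistics.mean(biased))    # ≈ 6.6, about (n-1)/n * sigma^2
```

Averaged over many samples, the divisor-(n − 1) estimator centers on σ² while the divisor-n estimator falls short, which is why s² is the conventional estimator of the population variance.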
