
A.) Sampling is the process of selecting units (e.g., people, organizations) from a population of interest so that
by studying the sample we may fairly generalize our results back to the population from which they were
chosen. In statistics, quality assurance, and survey methodology, sampling is concerned with
the selection of a subset of individuals from within a statistical population to estimate
characteristics of the whole population.
B.) TYPES OF SAMPLING
Nonprobability sampling is any sampling method where some elements of the population
have no chance of selection (these are sometimes referred to as 'out of
coverage'/'undercovered'), or where the probability of selection can't be accurately
determined. It involves the selection of elements based on assumptions regarding the
population of interest, which forms the criteria for selection. Hence, because the selection of
elements is nonrandom, nonprobability sampling does not allow the estimation of sampling
errors. These conditions give rise to exclusion bias, placing limits on how much information a
sample can provide about the population. Information about the relationship between sample
and population is limited, making it difficult to extrapolate from the sample to the population.

2.) Quota sampling is designed to overcome the most obvious flaw of availability (convenience)
sampling, which simply takes whoever is easiest to reach. Rather than taking just anyone, you set
quotas to ensure that the sample you get represents
certain characteristics in proportion to their prevalence in the population. Note that for this
method, you have to know something about the characteristics of the population ahead of
time. Say you want to make sure you have a sample proportional to the population in terms
of gender - you have to know what percentage of the population is male and female, then
collect sample until yours matches. Marketing studies are particularly fond of this form of
research design.
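As a sketch of the idea (the 48/52 gender split and the stream of respondents are made-up numbers, not part of the original text), quota sampling can be simulated by accepting each arriving respondent only while his or her group's quota is still open:

```python
import random

# Hypothetical population stream of respondents: 48% male, 52% female.
random.seed(1)
stream = (random.choices(["male", "female"], weights=[48, 52])[0]
          for _ in range(10_000))

# Quotas proportional to the (known) population split, for a sample of 50.
quotas = {"male": 24, "female": 26}
sample = []

# Take whoever arrives next, but only while that group's quota is open.
for gender in stream:
    if quotas[gender] > 0:
        quotas[gender] -= 1
        sample.append(gender)
    if all(v == 0 for v in quotas.values()):
        break

print(sample.count("male"), sample.count("female"))   # 24 26
```

Note that the selection within each quota is still nonrandom: anyone who happens to arrive while the quota is open can be taken, which is why quota sampling remains a nonprobability method.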
3.) Purposive sampling is a sampling method in which elements are chosen based on
purpose of the study. Purposive sampling may involve studying the entire population of some
limited group (sociology faculty at Columbia) or a subset of a population (Columbia faculty
who have won Nobel Prizes). As with other non-probability sampling methods, purposive
sampling does not produce a sample that is representative of a larger population, but it can
be exactly what is needed in some cases - study of organization, community, or some other
clearly defined and relatively limited group.

A probability sample is a sample in which every unit in the population has a chance
(greater than zero) of being selected in the sample, and this probability can be accurately
determined. The combination of these traits makes it possible to produce unbiased estimates
of population totals, by weighting sampled units according to their probability of selection.
1.) In a simple random sample (SRS) of a given size, all subsets of the sampling frame of that size
are given an equal probability of selection. Furthermore, any given pair of elements has the same chance of
selection as any other such pair (and similarly for triples, and so on). This minimises bias
and simplifies analysis of results. In particular, the variance between individual results
within the sample is a good indicator of variance in the overall population, which makes it
relatively easy to estimate the accuracy of results.
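In practice an SRS is usually drawn with a computational random number generator. A minimal sketch (the frame of 1000 unit IDs is hypothetical):

```python
import random

# Hypothetical sampling frame: a list of 1000 unit IDs.
frame = list(range(1000))

# random.sample draws without replacement, giving every subset of size n
# the same probability of selection -- a simple random sample.
srs = random.sample(frame, 10)

print(len(srs))   # 10 distinct units, all drawn from the frame
```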
2.) Stratified Random Sampling
In this form of sampling, the population is first divided into two or more mutually
exclusive segments based on some categories of variables of interest in the
research. It is designed to organize the population into homogenous subsets before
sampling, then drawing a random sample within each subset. With stratified random
sampling the population of N units is divided into subpopulations of N1, N2, ..., NL units respectively.
These subpopulations, called strata, are non-overlapping and together they comprise the
whole of the population, so that N1 + N2 + ... + NL = N. When the strata have been determined, a sample is drawn from
each, with a separate draw for each of the different strata. The sample sizes within the
strata are denoted by n1, n2, ..., nL respectively. If an SRS is taken within each stratum, then the whole
sampling procedure is described as stratified random sampling.
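A sketch of stratified random sampling with proportional allocation (the three strata and the 10% sampling fraction are hypothetical):

```python
import random

# Hypothetical population divided into non-overlapping strata.
strata = {
    "undergrad": [f"U{i}" for i in range(600)],
    "masters":   [f"M{i}" for i in range(300)],
    "phd":       [f"P{i}" for i in range(100)],
}

# Proportional allocation: draw an SRS of 10% within each stratum.
sample = {name: random.sample(units, len(units) // 10)
          for name, units in strata.items()}

for name, drawn in sample.items():
    print(name, len(drawn))   # undergrad 60, masters 30, phd 10
```

Because every stratum is sampled, even small subpopulations (here, the 100 PhD students) are guaranteed representation in the sample.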
3.) Cluster Sampling
In some instances the sampling unit consists of a group or cluster of smaller units
that we call elements or subunits (these are the units of analysis for your study).
There are two main reasons for the widespread application of cluster sampling.
Although the first intention may be to use the elements as sampling units, it is found
in many surveys that no reliable list of elements in the population is available and
that it would be prohibitively expensive to construct such a list. In many countries
there are no complete and updated lists of the people, the houses or the farms in
any large geographical region.
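A sketch of single-stage cluster sampling (the villages and their sizes are made up): note that the elements inside a cluster never need to be listed until that cluster has been selected, which is exactly why the method is useful when no element-level frame exists.

```python
import random

# Hypothetical frame of clusters (e.g. villages); each contains an
# unlisted group of elements (people), here 20-50 per village.
clusters = {f"village_{i}": [f"v{i}_person_{j}"
                             for j in range(random.randint(20, 50))]
            for i in range(40)}

# Stage 1: take an SRS of clusters.
chosen = random.sample(list(clusters), 5)

# Stage 2: survey every element inside each chosen cluster.
respondents = [person for name in chosen for person in clusters[name]]

print(len(chosen), "clusters,", len(respondents), "respondents")
```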

4.) Random number tables have been used in statistics for tasks such as selecting random
samples. This was much more effective than manually selecting random
samples (with dice, cards, etc.). Nowadays, tables of random numbers have been replaced by
computational random number generators.

If carefully prepared, the filtering and testing processes remove any noticeable bias or asymmetry
from the hardware-generated original numbers so that such tables provide the most "reliable"
random numbers available to the casual user.

What is Hypothesis Testing?

A statistical hypothesis is an assumption about a population parameter. This


assumption may or may not be true. Hypothesis testing refers to the formal
procedures used by statisticians to accept or reject statistical hypotheses.

Statistical Hypotheses
The best way to determine whether a statistical hypothesis is true would be to examine
the entire population. Since that is often impractical, researchers typically examine a
random sample from the population. If sample data are not consistent with the
statistical hypothesis, the hypothesis is rejected.

There are two types of statistical hypotheses.


Null hypothesis. The null hypothesis, denoted by H0, is
usually the hypothesis that sample observations result purely
from chance.

Alternative hypothesis. The alternative hypothesis, denoted


by H1 or Ha, is the hypothesis that sample observations are
influenced by some non-random cause.

Parametric hypothesis tests are frequently used to test hypotheses about sample parameters or to test whether
estimates of a given parameter are equal for two samples.

Parametric hypothesis tests set up a null hypothesis against an alternative hypothesis, testing, for instance,
whether or not the population mean is equal to a certain value, and then use an appropriate statistic to
calculate the probability of obtaining the observed sample results if the null hypothesis were true. You can
then reject or retain the null hypothesis based on that probability.

T-Test (Independent Samples)

Definition
A t-test helps you compare whether two groups have different average values (for
example, whether men and women have different average heights).

The independent-samples t-test (or independent t-test, for short) compares the means between
two unrelated groups on the same continuous, dependent variable. For example, you could use
an independent t-test to understand whether first year graduate salaries differed based on
gender (i.e., your dependent variable would be "first year graduate salaries" and your
independent variable would be "gender", which has two groups: "male" and "female").
Alternately, you could use an independent t-test to understand whether there is a difference in
test anxiety based on educational level (i.e., your dependent variable would be "test anxiety"
and your independent variable would be "educational level", which has two groups:
"undergraduates" and "postgraduates").

A correlated-samples, or dependent-samples, t-Test, which is introduced here, is used
when you have one sample of subjects who are tested several times but under different
conditions. That is, each subject is measured on the same dependent variable, but under different
levels of an independent variable. You compare performance of the subjects between the
different levels of this independent variable (a within-subjects design).
A correlated-groups t-Test, or more generally a within-subjects design, is preferred over the
independent groups t-Test (between-subjects designs), whenever it is practical to use. The
preference is based on the fact that the correlated-groups t-Test, and all other inferential tests
designed to analyze within-subjects designs, are more statistically powerful than the independent-
groups t-Test and all inferential tests that are designed to analyze between-subjects designs.
What this 'power issue' refers to is that you are more likely to produce a statistically significant
result and reject your null hypothesis with a correlated-groups t-Test (within-subject analyses)
than with an independent-groups t-Test (between-subjects analyses). The reason for this
additional power is not based on the mean difference between conditions/samples; rather, it is
based on the standard error term in the t-Test (the denominator). If you were to take any set of
data that includes two conditions or levels of an independent variable, holding everything
constant the mean difference between those two conditions will be the same regardless of
whether you use an independent-groups t-Test or a correlated-groups t-Test. This is because
you will have exactly the same values in each condition; hence, the means for each condition
will be the same. However, no matter what, the error term will always be smaller for a
correlated-groups t-Test than for an independent-groups t-Test. This is what makes the
correlated-groups t-Test (and all within-subject design analyses) more statistically powerful.
Because it has a smaller error term, the obtained t-Value will always be larger.
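The 'power issue' can be seen numerically. In this sketch (the six before/after scores are made up), the same data are analyzed both ways: the mean difference is identical, but the paired error term is smaller, so the paired t is larger:

```python
from math import sqrt

# Hypothetical before/after scores for the same 6 subjects.
before = [12, 15, 11, 18, 14, 16]
after  = [14, 18, 12, 21, 15, 19]
n = len(before)

def mean(xs):
    return sum(xs) / len(xs)

def var(xs):  # sample variance, n - 1 in the denominator
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

# Independent-groups t: standard error of the difference between means.
se_ind = sqrt(var(before) / n + var(after) / n)
t_ind = (mean(after) - mean(before)) / se_ind

# Correlated-groups t: standard error of the mean difference score.
diffs = [a - b for a, b in zip(after, before)]
se_dep = sqrt(var(diffs) / n)
t_dep = mean(diffs) / se_dep

# Same mean difference in the numerator, smaller error term in the
# denominator, therefore a larger obtained t for the paired analysis.
print(round(t_ind, 2), round(t_dep, 2))   # 1.25 5.4
```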

Independent Events

Two events are independent if the occurrence of one does not change the probability
of the other occurring.

An example would be rolling a 2 on a die and flipping a head on a coin. Rolling the 2
does not affect the probability of flipping the head.

If events are independent, then the probability of them both occurring is the product of
the probabilities of each occurring.
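The die-and-coin example works out as follows (using exact fractions):

```python
from fractions import Fraction

# P(roll a 2 on a fair die) and P(flip a head on a fair coin) are
# independent, so the joint probability is the product of the two.
p_two = Fraction(1, 6)
p_head = Fraction(1, 2)

p_both = p_two * p_head
print(p_both)   # 1/12
```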
A Z-test is any statistical test for which the distribution of the test statistic under the null
hypothesis can be approximated by a normal distribution. Because of the central limit theorem, many
test statistics are approximately normally distributed for large samples. For each significance level,
the Z-test has a single critical value (for example, 1.96 for 5% two tailed) which makes it more
convenient than the Student's t-test which has separate critical values for each sample size.
Therefore, many statistical tests can be conveniently performed as approximate Z-tests if the sample
size is large or the population variance known. If the population variance is unknown (and therefore
has to be estimated from the sample itself) and the sample size is not large (n < 30), the Student's t-
test may be more appropriate.
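A sketch of a two-tailed Z-test using Python's standard library (the numbers are hypothetical: known population sigma = 15, H0: mu = 100, a sample of n = 50 with mean 104):

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical setup: population variance known, large-ish sample.
mu0, sigma, n, xbar = 100, 15, 50, 104

# Z statistic: distance of the sample mean from mu0 in standard errors.
z = (xbar - mu0) / (sigma / sqrt(n))

# Two-tailed p-value from the standard normal distribution.
p = 2 * (1 - NormalDist().cdf(abs(z)))

# |z| < 1.96, so H0 is not rejected at the 5% level.
print(round(z, 3), round(p, 4))
```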

Correlation is a statistical technique that can show whether and how strongly pairs of variables are
related. For example, height and weight are related; taller people tend to be heavier than shorter people.
The relationship isn't perfect. People of the same height vary in weight, and you can easily think of two
people you know where the shorter one is heavier than the taller one. Nonetheless, the average weight of
people 5'5'' is less than the average weight of people 5'6'', and their average weight is less than that of
people 5'7'', etc. Correlation can tell you just how much of the variation in peoples' weights is related to
their heights. Although this correlation is fairly obvious your data may contain unsuspected correlations.
You may also suspect there are correlations, but don't know which are the strongest. An intelligent
correlation analysis can lead to a greater understanding of your data.
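A sketch of computing the Pearson correlation coefficient for height and weight (the ten data pairs are made up for illustration):

```python
from math import sqrt

# Hypothetical height (inches) / weight (pounds) pairs.
heights = [63, 64, 66, 69, 69, 71, 71, 72, 73, 75]
weights = [127, 121, 142, 157, 162, 156, 169, 165, 181, 208]

n = len(heights)
mx, my = sum(heights) / n, sum(weights) / n

# Pearson r: the co-deviation of the pairs, scaled by the two
# standard deviations, so that -1 <= r <= 1.
cov = sum((x - mx) * (y - my) for x, y in zip(heights, weights))
sx = sqrt(sum((x - mx) ** 2 for x in heights))
sy = sqrt(sum((y - my) ** 2 for y in weights))

r = cov / (sx * sy)
print(round(r, 3))   # strong positive correlation, but not perfect
```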

An F-test is any statistical test in which the test statistic has an F-distribution under the null
hypothesis. It is most often used when comparing statistical models that have been fitted to
a data set, in order to identify the model that best fits the population from which the data were
sampled. Exact "F-tests" mainly arise when the models have been fitted to the data using least
squares. The name was coined by George W. Snedecor, in honour of Sir Ronald A. Fisher. Fisher
initially developed the statistic as the variance ratio in the 1920s. [1]

Multiple-comparison ANOVA problems

The F-test in one-way analysis of variance is used to assess whether the expected values of a
quantitative variable within several pre-defined groups differ from each other. For example, suppose
that a medical trial compares four treatments. The ANOVA F-test can be used to assess whether any
of the treatments is on average superior, or inferior, to the others versus the null hypothesis that all
four treatments yield the same mean response. This is an example of an "omnibus" test, meaning
that a single test is performed to detect any of several possible differences. Alternatively, we could
carry out pairwise tests among the treatments (for instance, in the medical trial example with four
treatments we could carry out six tests among pairs of treatments). The advantage of the ANOVA F-
test is that we do not need to pre-specify which treatments are to be compared, and we do not need
to adjust for making multiple comparisons. The disadvantage of the ANOVA F-test is that if we reject
the null hypothesis, we do not know which treatments can be said to be significantly different from
the others; if the F-test is performed at level α, we cannot state that the treatment pair with the
greatest mean difference is significantly different at level α.

The formula for the one-way ANOVA F-test statistic is

F = explained variance / unexplained variance,

or

F = between-group variability / within-group variability.

The "explained variance", or "between-group variability", is

Σi ni(Ȳi − Ȳ)² / (K − 1),

where Ȳi denotes the sample mean in the ith group, ni is the number of observations in
the ith group, Ȳ denotes the overall mean of the data, and K denotes the number of
groups.

The "unexplained variance", or "within-group variability", is

Σij (Yij − Ȳi)² / (N − K),

where Yij is the jth observation in the ith out of K groups and N is the overall sample
size. This F-statistic follows the F-distribution with K − 1 and N − K degrees of freedom
under the null hypothesis. The statistic will be large if the between-group variability
is large relative to the within-group variability, which is unlikely to happen if
the population means of the groups all have the same value.
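As a check on the definitions, here is a worked F computation on small made-up data with K = 3 groups of 6 observations each:

```python
# Hypothetical data: K = 3 groups, 6 observations per group.
groups = [[6, 8, 4, 5, 3, 4],
          [8, 12, 9, 11, 6, 8],
          [13, 9, 11, 8, 7, 12]]

K = len(groups)
N = sum(len(g) for g in groups)
grand_mean = sum(sum(g) for g in groups) / N

# Between-group variability: sum of n_i * (group mean - grand mean)^2,
# divided by K - 1 degrees of freedom.
between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
              for g in groups) / (K - 1)

# Within-group variability: sum of squared deviations from each group's
# own mean, divided by N - K degrees of freedom.
within = sum(sum((y - sum(g) / len(g)) ** 2 for y in g)
             for g in groups) / (N - K)

F = between / within
print(round(F, 2))   # compare to the F distribution with K-1, N-K df
```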
Nonparametric tests are useful for testing whether group means or medians differ across groups.
In these tests, we rank (or place in order) each observation from our data set. Nonparametric tests are widely
used when you do not know whether your data follow a normal distribution, or when you have confirmed that your data do
not follow a normal distribution. By contrast, parametric hypothesis tests are based on the assumption that the
population follows a normal distribution with a set of parameters.

Kruskal-Wallis ANOVA
Kruskal-Wallis (K-W) ANOVA is a nonparametric alternative to the one-way analysis of variance (ANOVA) test. The K-
W ANOVA uses rank sums to determine whether three or more independent samples are taken from the same
distribution (when comparing two samples, the Mann-Whitney test is more often used). When K-W test results are
significant, post-hoc tests between pairs of samples can be used to determine which pairs show significant
differences.

The One-Sample Wilcoxon Signed Rank Test is a nonparametric alternative to a one-sample t-test. The test
determines whether the median of the sample is equal to some specified value. Data should be distributed
symmetrically about the median.

Paired-Sample Wilcoxon Signed Rank Test


The Paired-Sample Wilcoxon Signed Rank Test is a nonparametric alternative method to the paired sample t-test.
Paired samples are presumed to be drawn, at random, from a single population. Differences between paired samples
are assumed to be distributed symmetrically about the median.

Test Statistic for the Sign Test


The test statistic for the Sign Test is the number of positive signs or number of negative signs,
whichever is smaller. In this example, we observe 2 negative and 6 positive signs. Is this
evidence of significant improvement or simply due to chance?

Determining whether the observed test statistic supports the null or research hypothesis is done
following the same approach used in parametric testing. Specifically, we determine a critical
value such that if the smaller of the number of positive or negative signs is less than or equal to
that critical value, then we reject H0 in favor of H1 and if the smaller of the number of positive or
negative signs is greater than the critical value, then we do not reject H0. Notice that this is a
one-sided decision rule corresponding to our one-sided research hypothesis (the two-sided
situation is discussed in the next example).

Table of Critical Values for the Sign Test


The critical values for the Sign Test are in the table below.

To determine the appropriate critical value we need the sample size, which is equal to the
number of matched pairs (n=8), and our one-sided level of significance α=0.05. For this
example, the critical value is 1, and the decision rule is to reject H0 if the smaller of the number
of positive or negative signs is ≤ 1. We do not reject H0 because 2 > 1. We do not have sufficient
evidence at α=0.05 to show that there is improvement in repetitive behavior after taking the drug
as compared to before. In essence, we could use the critical value to decide whether to reject
the null hypothesis. Another alternative would be to calculate the p-value, as described below.

Computing P-values for the Sign Test


With the Sign test we can readily compute a p-value based on our observed test statistic. The
test statistic for the Sign Test is the smaller of the number of positive or negative signs and it
follows a binomial distribution with n = the number of subjects in the study and p=0.5 (See the
module on Probability for details on the binomial distribution). In the example above, n=8 and
p=0.5 (the probability of success under H0).

Using the binomial distribution formula, the one-sided p-value is
P(X ≤ 2) = P(X=0) + P(X=1) + P(X=2) = (1 + 8 + 28)/2⁸ = 37/256 ≈ 0.145.
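For the example above (n = 8 matched pairs, 2 negative signs, p = 0.5 under H0), the exact one-sided p-value can be computed directly:

```python
from math import comb

# Sign test: n = 8 pairs, observed test statistic = 2 (smaller sign count).
n, observed = 8, 2

# Exact one-sided p-value: P(X <= 2) for X ~ Binomial(8, 0.5).
p_value = sum(comb(n, k) * 0.5 ** n for k in range(observed + 1))

print(p_value)   # 37/256 = 0.14453125, so H0 is not rejected at 0.05
```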

Tests with More than Two Independent Samples


In the modules on hypothesis testing we presented techniques for testing the equality of means
in more than two independent samples using analysis of variance (ANOVA). An underlying
assumption for appropriate use of ANOVA was that the continuous outcome was approximately
normally distributed or that the samples were sufficiently large (usually nj> 30, where j=1, 2, ..., k
and k denotes the number of independent comparison groups). An additional assumption for
appropriate use of ANOVA is equality of variances in the k comparison groups. ANOVA is
generally robust when the sample sizes are small but equal. When the outcome is not normally
distributed and the samples are small, a nonparametric test is appropriate.

The Kruskal-Wallis Test


A popular nonparametric test to compare outcomes among more than two independent groups
is the Kruskal Wallis test. The Kruskal Wallis test is used to compare medians among k
comparison groups (k > 2) and is sometimes described as an ANOVA with the data replaced by
their ranks. The null and research hypotheses for the Kruskal Wallis nonparametric test are
stated as follows:

H0: The k population medians are equal versus

H1: The k population medians are not all equal

The procedure for the test involves pooling the observations from the k samples into one
combined sample, keeping track of which sample each observation comes from, and then
ranking lowest to highest from 1 to N, where N = n1+n2 + ...+ nk. To illustrate the procedure,
consider the following example.

Wilcoxon Signed Rank Test


Use: To compare a continuous outcome in two matched or paired samples.

Null Hypothesis: H0: Median difference is zero

Test Statistic: The test statistic is W, defined as the smaller of W+ and W- which are
the sums of the positive and negative ranks of the difference scores, respectively.

Decision Rule: Reject H0 if W < critical value from table.
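A sketch of computing W from a set of made-up difference scores (chosen so that the absolute values have no ties):

```python
# Hypothetical paired difference scores for 8 subjects (no tied |d|).
diffs = [4, -2, 6, 3, -1, 5, 8, 7]

# Rank the absolute differences from 1 to n, then attach each rank
# back to the sign of its difference score.
order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
ranks = {i: r + 1 for r, i in enumerate(order)}

w_plus = sum(ranks[i] for i, d in enumerate(diffs) if d > 0)
w_minus = sum(ranks[i] for i, d in enumerate(diffs) if d < 0)

# The test statistic W is the smaller of the two rank sums.
W = min(w_plus, w_minus)
print(w_plus, w_minus, W)   # 33 3 3
```

As a check, W+ and W- must always sum to n(n+1)/2, the total of the ranks 1 through n.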

Kruskal Wallis Test


Use: To compare a continuous outcome in more than two independent samples.

Null Hypothesis: H0: k population medians are equal

Test Statistic: The test statistic is H:

H = (12 / (N(N+1))) Σj (Rj² / nj) − 3(N+1),

where k = the number of comparison groups, N = the total sample size, nj is the sample
size in the jth group, and Rj is the sum of the ranks in the jth group.

Decision Rule: Reject H0 if H > critical value
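A sketch of the full procedure (pool, rank, sum ranks per group, compute H) on made-up data for k = 3 groups with no ties:

```python
# Hypothetical scores in k = 3 independent groups (no tied values).
groups = [[7, 14, 22, 36, 40],
          [15, 17, 20, 24, 30],
          [5, 8, 11, 21, 26]]

k = len(groups)
N = sum(len(g) for g in groups)

# Pool all observations and rank them 1..N from lowest to highest.
pooled = sorted(x for g in groups for x in g)
rank = {x: i + 1 for i, x in enumerate(pooled)}

# Sum the ranks within each group.
R = [sum(rank[x] for x in g) for g in groups]

# H = 12 / (N(N+1)) * sum(R_j^2 / n_j) - 3(N+1)
H = (12 / (N * (N + 1))
     * sum(r ** 2 / len(g) for r, g in zip(R, groups))
     - 3 * (N + 1))

print(R, round(H, 2))   # [46, 45, 29] 1.82
```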


In simple linear regression, we predict scores on one variable from
the scores on a second variable. The variable we are predicting is
called the criterion variable and is referred to as Y. The variable we
are basing our predictions on is called the predictor variable and is
referred to as X. When there is only one predictor variable, the
prediction method is called simple regression. In simple linear
regression, the topic of this section, the predictions of Y when
plotted as a function of X form a straight line.
The example data in Table 1 are plotted in Figure 1. You can see
that there is a positive relationship between X and Y. If you were
going to predict Y from X, the higher the value of X, the higher your
prediction of Y.
Table 1. Example data.

X       Y
1.00    1.00
2.00    2.00
3.00    1.30
4.00    3.75
5.00    2.25
Figure 1. A scatter plot of the example data.

Linear regression consists of finding the best-fitting straight line


through the points. The best-fitting line is called a regression line.
The black diagonal line in Figure 2 is the regression line and consists
of the predicted score on Y for each possible value of X. The vertical
lines from the points to the regression line represent the errors of
prediction. As you can see, the red point is very near the regression
line; its error of prediction is small. By contrast, the yellow point is
much higher than the regression line and therefore its error of
prediction is large.
Figure 2. A scatter plot of the example data. The black line consists of
the predictions, the points are the actual data, and the vertical lines
between the points and the black line represent errors of prediction.
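The regression line for the Table 1 data can be found with the standard least-squares formulas: slope = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)², and the line passes through the point of means, so intercept = ȳ − slope · x̄.

```python
# Table 1 data.
X = [1.00, 2.00, 3.00, 4.00, 5.00]
Y = [1.00, 2.00, 1.30, 3.75, 2.25]

n = len(X)
mx, my = sum(X) / n, sum(Y) / n

# Least-squares slope and intercept.
slope = (sum((x - mx) * (y - my) for x, y in zip(X, Y))
         / sum((x - mx) ** 2 for x in X))
intercept = my - slope * mx

print(round(slope, 3), round(intercept, 3))   # 0.425 0.785

# Predicted scores, and the errors of prediction (the vertical
# distances between the points and the regression line).
predictions = [intercept + slope * x for x in X]
errors = [y - p for y, p in zip(Y, predictions)]
```

So the best-fitting line here is Y' = 0.785 + 0.425X; each error term is the vertical distance shown in Figure 2.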
