
Inferential Statistics

Chapter Eleven
Dr Nek Kamal Yeop Yunus
Faculty of Business & Economics
Sultan Idris Education University

McGraw-Hill

© 2006 The McGraw-Hill Companies, Inc. All rights reserved.


What are Inferential Statistics?

Inferential statistics refer to procedures that allow researchers to make inferences about a population based on data obtained from a sample.
Obtaining a random sample is desirable because it increases the likelihood that the sample is representative of the larger population.
The better a sample represents its population, the more confidently researchers can make inferences about that population.
Making inferences about populations is what inferential statistics are all about.


Two Samples from Two Distinct Populations (Figure 11.1)


Sampling Error

It is reasonable to assume that each sample will give you a fairly accurate picture of its population.
However, samples are not likely to be identical to their parent populations.
This difference between a sample and its population is known as sampling error. (See Figure 11.2)
Furthermore, no two samples will be identical in all their characteristics.


Sampling Error (Figure 11.2)

Distribution of Sample Means

There are times when large collections of random samples pattern themselves in ways that allow researchers to predict accurately some characteristics of the population from which the samples were taken.
A sampling distribution of means is a frequency distribution resulting from plotting the means of a very large number of samples from the same population. (See Figure 11.3)
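The idea behind a sampling distribution of means can be illustrated with a short simulation (a minimal sketch: the population parameters, sample size, and number of samples below are invented for illustration, not taken from the chapter):

```python
import random
import statistics

random.seed(42)  # fixed seed so the illustration is reproducible

# Hypothetical population: 10,000 scores with mean near 100 and SD near 15
population = [random.gauss(100, 15) for _ in range(10_000)]

# Draw many samples of size 25 and record each sample's mean.
# Plotting these means would produce a picture like Figure 11.3.
sample_means = [
    statistics.mean(random.sample(population, 25)) for _ in range(2_000)
]

# The sampling distribution of means centers on the population mean,
# and its spread (the standard error) shrinks as sample size grows.
print(round(statistics.mean(sample_means), 1))
print(round(statistics.stdev(sample_means), 1))
```

Even though individual samples differ from the population (sampling error), the distribution of their means is tightly clustered around the population mean.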


A Sampling Distribution of Means (Figure 11.3)


Distribution of Sample Means (Figure 11.4)


Standard Error of the Mean

The standard deviation of a sampling distribution of means is called the Standard Error of the Mean (SEM).
If you can accurately estimate the mean and the standard deviation of the sampling distribution, you can determine whether it is likely or not that a particular sample mean could be obtained from the population.
To estimate the SEM, divide the standard deviation of the sample by the square root of the sample size minus one. (See Figure 11.4)
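The rule on this slide (sample SD divided by the square root of n minus one) translates directly into code; the SD and sample size below are made-up numbers for illustration:

```python
import math

def standard_error_of_mean(sample_sd: float, n: int) -> float:
    """Estimate the SEM as described above: sample SD / sqrt(n - 1)."""
    return sample_sd / math.sqrt(n - 1)

# Hypothetical sample: SD = 12, n = 65
sem = standard_error_of_mean(12, 65)
print(round(sem, 2))  # 12 / sqrt(64) = 1.5
```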


Confidence Intervals

A Confidence Interval is a region extending both above and below a sample statistic within which the population parameter may be said to fall, with a specified probability of being wrong.
SEMs can be used to determine the boundaries, or limits, within which the population mean lies.
With a 95% confidence interval, we would expect the population mean to fall outside the boundaries or limits in only 5 cases out of 100.


The 95 percent Confidence Interval (Figure 11.5)


The 99 percent Confidence Interval (Figure 11.6)


We Can Be 99 percent Confident (Figure 11.7)


Does a Sample Difference Reflect a Population Difference? (Figure 11.8)


Distribution of the Difference Between Sample Means (Figure 11.9)


Confidence Intervals (Figure 11.10)

Hypothesis Testing

Hypothesis testing is a way of determining the probability that an obtained sample statistic would occur, given a hypothetical population parameter.
The Research Hypothesis specifies the predicted outcome of a study.
The Null Hypothesis typically specifies that there is no relationship in the population.


Research and Null Hypotheses (Figure 11.11)


Illustration of When a Researcher Would Reject the Null Hypothesis (Figure 11.12)


Hypothesis Testing: A Review

1. State the research hypothesis.
2. State the null hypothesis.
3. Determine the sample statistics pertinent to the hypothesis.
4. Determine the probability of obtaining the sample results if the null hypothesis is true.
5. If the probability is small, reject the null hypothesis and affirm the research hypothesis.
6. If the probability is large, do not reject the null hypothesis and do not affirm the research hypothesis.


Practical vs. Statistical Significance

The terms significance level or level of significance refer to the probability of a sample statistic occurring as a result of sampling error.
The significance levels most commonly used in educational research are the .05 and .01 levels.
Statistical significance and practical significance are not necessarily the same: a statistically significant result is not necessarily significant in a practical, educational sense.


One and Two-tailed Tests

A one-tailed test uses only one tail of the sampling distribution because the research hypothesis predicts the direction of the difference (for example, that one sample mean will exceed the other). (Figure 11.13)
A two-tailed test involves the use of probabilities based on both sides of a sampling distribution because the research hypothesis is a nondirectional hypothesis.
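The practical consequence is that the same result can be significant one-tailed but not two-tailed. A minimal numeric sketch (the standardized difference of 1.80 is invented, and a normal approximation is assumed):

```python
import math

def normal_cdf(z):
    """Standard normal cumulative probability, via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

z = 1.80  # hypothetical standardized difference between two sample means

one_tailed_p = 1 - normal_cdf(z)        # only the predicted (positive) tail
two_tailed_p = 2 * (1 - normal_cdf(z))  # both tails of the distribution

print(round(one_tailed_p, 3))  # about .036: significant at .05 one-tailed
print(round(two_tailed_p, 3))  # about .072: not significant at .05 two-tailed
```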


Significance Area for a One-tailed Test (Figure 11.13)


One-tailed Test Using a Distribution of Differences Between Sample Means (Figure 11.14)


Two-tailed Test Using a Distribution of Differences Between Sample Means (Figure 11.15)


Contingency Coefficient Values for Different-Sized Crossbreak Tables (Table 11.1)

Size of Table (No. of Cells)    Upper Limit for C Calculated*
2 by 2                          .71
3 by 3                          .82
4 by 4                          .87
5 by 5                          .89
6 by 6                          .91

*The upper limits for unequal-sized tables (such as 2 by 3 or 3 by 4) are unknown but can be estimated from the values given. Thus, the upper limit for a 3 by 4 table would approximate .85.
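The contingency coefficient being bounded in this way can be seen from its usual formula, C = sqrt(chi-square / (chi-square + N)). A minimal sketch (the chi-square value and N below are invented for illustration):

```python
import math

def contingency_coefficient(chi_square: float, n: int) -> float:
    """C = sqrt(chi2 / (chi2 + N)); compare against the upper
    limits for C shown in Table 11.1."""
    return math.sqrt(chi_square / (chi_square + n))

# Hypothetical 2-by-2 result: chi-square = 8.0 on N = 100 cases
print(round(contingency_coefficient(8.0, 100), 2))  # 0.27, well under .71
```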


Commonly Used Inferential Techniques (Table 11.2)

Quantitative data
  Parametric:
    t-test for independent means
    t-test for correlated means
    Analysis of variance (ANOVA)
    Analysis of covariance (ANCOVA)
    Multivariate analysis of variance (MANOVA)
    t-test for r
  Nonparametric:
    Mann-Whitney U test
    Kruskal-Wallis one-way analysis of variance
    Sign test
    Friedman two-way analysis of variance

Categorical data
  Parametric:
    t-test for difference in proportions
  Nonparametric:
    Chi square

Type I vs. Type II Error

A null hypothesis predicts no relationship.
A Type II error results when the researcher fails to reject a null hypothesis that is false.
A Type I error results when the researcher rejects a null hypothesis that is true.
Figure 11.16 provides an example of Type I and Type II errors.
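The meaning of a Type I error can be demonstrated by simulation: when the null hypothesis really is true, a test at the .05 level should reject, wrongly, about 5% of the time. A minimal sketch with an invented population and a z approximation:

```python
import math
import random
import statistics

random.seed(1)  # reproducible illustration

def z_for_two_means(a, b):
    """Standardized difference between two sample means (z approximation)."""
    se = math.sqrt(statistics.variance(a) / len(a)
                   + statistics.variance(b) / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se

# Both groups come from the same population, so the null hypothesis is true
# and every rejection below is a Type I error.
trials, errors = 2_000, 0
for _ in range(trials):
    a = [random.gauss(0, 1) for _ in range(30)]
    b = [random.gauss(0, 1) for _ in range(30)]
    if abs(z_for_two_means(a, b)) > 1.96:  # two-tailed test at alpha = .05
        errors += 1

print(errors / trials)  # close to .05
```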


Hypothetical Example of Type I and Type II Errors (Figure 11.16)


Rejecting the Null Hypothesis (Figure 11.17)


Inference Techniques

There are two basic types of inference techniques:
1) Parametric: makes assumptions about the nature of the population from which the samples involved in the research study were taken.
2) Nonparametric: makes few assumptions about the nature of the population from which the samples are taken.

An Illustration of Power Under an Assumed Population Value (Figure 11.18)


Parametric Techniques for Analyzing Quantitative Data

The t-test is a parametric statistical test used to see whether a difference between the means of two samples is significant.
There are two forms of t-test:
1) The t-test for correlated means
2) The t-test for independent means
Analysis of variance (ANOVA) is used to determine whether significant differences exist among the means of two or more groups.
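A minimal sketch of the independent-means form, computing the t statistic by hand from the pooled variance (the two lists of scores are invented for illustration):

```python
import math
import statistics

def t_independent(a, b):
    """t-test for independent means: the difference between means divided
    by the standard error of the difference (pooled-variance form)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    se_diff = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / se_diff

# Hypothetical test scores under two teaching methods
method_a = [78, 85, 90, 72, 88, 81, 79, 94]
method_b = [70, 74, 82, 68, 75, 71, 77, 73]
t = t_independent(method_a, method_b)
print(round(t, 2))  # compare with the critical t for df = 14
```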


Parametric Techniques for Analyzing Quantitative Data (cont.)

Analysis of covariance (ANCOVA) is a variation of ANOVA used when groups are given a pretest related in some way to the dependent variable and their mean scores on this pretest are found to differ.
Multivariate analysis of variance (MANOVA) incorporates two or more dependent variables in the same analysis, thus permitting a more powerful test of differences among means.
The t-test for r is used to see whether a correlation coefficient calculated on sample data is significant.
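The t-test for r can be sketched with its standard formula, t = r * sqrt(n - 2) / sqrt(1 - r^2), which has n - 2 degrees of freedom (the r and n below are hypothetical):

```python
import math

def t_for_r(r: float, n: int) -> float:
    """Convert a sample correlation into a t statistic with n - 2 df."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

# Hypothetical: r = .40 computed on a sample of 30
t = t_for_r(0.40, 30)
print(round(t, 2))  # compare with the critical t (about 2.05 for df = 28 at .05)
```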


Non-Parametric Techniques for Analyzing Quantitative Data

The Mann-Whitney U test is a nonparametric alternative to the t-test, used when a researcher wishes to analyze ranked data.
The Kruskal-Wallis one-way analysis of variance is used when researchers have more than two independent groups to compare.
The sign test is used to analyze two related samples. Related samples are connected in some way.
The Friedman two-way analysis of variance is used when more than two related groups are involved.
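The core of the Mann-Whitney U test is a simple count over all cross-group pairs, which makes it easy to sketch (the two groups of ratings are invented; this sketch computes U only, without the tables or normal approximation needed for a p-value):

```python
def mann_whitney_u(a, b):
    """U for group a: count the pairs in which a value from a exceeds
    a value from b, with ties counting one half."""
    u = 0.0
    for x in a:
        for y in b:
            if x > y:
                u += 1
            elif x == y:
                u += 0.5
    return u

# Hypothetical ratings from two independent groups
group_a = [12, 15, 18, 20, 22]
group_b = [8, 9, 11, 14, 16]
print(mann_whitney_u(group_a, group_b))  # 22.0 of a possible 25
```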


Techniques for Measuring Categorical Data

Parametric technique:
The t-test for proportions is used to find differences in proportions within categories.

Nonparametric techniques:
The chi-square test is used to analyze data that are reported in categories.
The contingency coefficient is a descriptive statistic indicating the degree of relationship that exists between two categorical variables.
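The chi-square statistic compares observed category counts with the counts expected if the variables were unrelated: sum of (O - E)^2 / E. A minimal sketch on an invented 2-by-2 crossbreak table:

```python
def chi_square(observed):
    """Chi-square for a two-way table, with expected counts derived
    from the row and column totals."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / total
            chi2 += (o - e) ** 2 / e
    return chi2

# Hypothetical 2-by-2 crossbreak: group membership by pass/fail
table = [[30, 10],
         [20, 20]]
chi2 = chi_square(table)
print(round(chi2, 2))  # compare with 3.84, the .05 critical value for df = 1
```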


Power of a Statistical Test

Power is the probability that the test will correctly lead to the conclusion that there is a difference when, in fact, a difference exists.
Parametric tests are generally, but not always, more powerful than nonparametric tests.
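For a two-tailed z-test, power can be computed directly from the assumed true difference and the standard error, mirroring the power-curve idea of Figure 11.19 (the effect size and SEM below are invented; a normal approximation is assumed):

```python
import math

def normal_cdf(z):
    """Standard normal cumulative probability, via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def power_two_tailed(effect, sem, alpha_z=1.96):
    """Probability of rejecting the null when the true difference is
    `effect`, for a two-tailed z-test with the given standard error."""
    shift = effect / sem
    return (1 - normal_cdf(alpha_z - shift)) + normal_cdf(-alpha_z - shift)

# Hypothetical: true difference of 3 points, SEM of the difference = 1.5
print(round(power_two_tailed(3.0, 1.5), 2))  # about .52
```

Larger true effects, or smaller standard errors (bigger samples), push power toward 1, which is why a power curve rises with the assumed population value.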


A Power Curve (Figure 11.19)

Any questions?


Thank You

