
Introduction to Hypothesis

Tests

Assistant Prof. Dr. Özgür Tosun


Normal Distribution

Abraham de Moivre (1667-1754)
Karl F. Gauss (1777-1855)
NORMAL DISTRIBUTION
Normal (Gaussian) distribution is the most famous
probability distribution of continuous variables.

The two parameters of the normal distribution are the mean (μ) and the standard deviation (σ).

The graph has a familiar bell-shaped curve.

The function of the normal distribution curve is as follows:

f(xᵢ) = (1 / (σ√(2π))) · e^(−(xᵢ − μ)² / (2σ²)),  −∞ < xᵢ < +∞
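As a quick numerical check of this density, the following Python sketch (an illustration added here, not part of the original slides) evaluates f(x) for a chosen mean and standard deviation:

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of the normal distribution with mean mu and SD sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

# Peak of the standard normal curve at x = 0 is 1/sqrt(2*pi):
print(round(normal_pdf(0.0), 4))  # 0.3989
```

The curve is symmetric about the mean, so for example normal_pdf(2, 5, 3) equals normal_pdf(8, 5, 3).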
The normal distribution is completely defined
by the mean and standard deviation of a set of
quantitative data:
The mean determines the location of the
curve on the x axis of a graph
The standard deviation determines the
spread of the curve, and hence its height on the y axis

There are an infinite number of normal
distributions: one for every possible combination
of mean and standard deviation
Examples of Normal
Distributions

Pr(X) on the y-axis refers to either frequency or probability.


Examples of Normal
Distributions
[Histogram: frequency distribution of mean heart rate (BPM); x axis 55-90]

Many (but not all) continuous variables are
approximately normally distributed. Generally, as
sample size increases, the shape of a frequency
distribution becomes more normally distributed.
When data are normally distributed, the mode,
median, and mean are identical and are located at
the center of the distribution.

[Figure: normal curve with mode, median, and mean coinciding at the center; y axis: frequency of occurrence]
Quantitative variables may also have a
skewed distribution:
When distributions are skewed, they have
more extreme values in one direction than the
other, resulting in a long tail on one side of the
distribution.
The direction of the tail determines whether a
distribution is positively or negatively skewed.
A positively skewed distribution has a long tail
on the right, or positive side of the curve.
A negatively skewed distribution has the tail
on the left, or negative side of the curve.
For a normally distributed variable:
~68.3% of the observations lie within 1 standard deviation of the mean
~95.4% lie within 2 standard deviations of the mean
~99.7% lie within 3 standard deviations of the mean
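These percentages can be checked by simulation. The short Python sketch below (an illustration added here, not from the slides) draws normal random values and counts how many fall within 1, 2, and 3 standard deviations of the mean:

```python
import random

random.seed(42)  # fixed seed so the run is reproducible
mu, sigma, n = 0.0, 1.0, 100_000
sample = [random.gauss(mu, sigma) for _ in range(n)]

for k in (1, 2, 3):
    within = sum(1 for x in sample if abs(x - mu) <= k * sigma)
    print(f"within {k} SD: {within / n:.3f}")  # approx 0.683, 0.954, 0.997
```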

[Figure: normal curve with mode, median, and mean at the center; central areas of 68.3% (μ ± σ), 95.4% (μ ± 2σ), and 99.7% (μ ± 3σ) shaded]

P(μ − σ ≤ x ≤ μ + σ) = 0.6826
P(μ − 2σ ≤ x ≤ μ + 2σ) = 0.9544
P(μ − 3σ ≤ x ≤ μ + 3σ) = 0.9974
For the heart rate data for 84 adults:

[Table of the 84 individual sex / heart rate (HR) values, garbled in extraction, omitted]

Mean HR = 74.0 bpm
SD = 7.5 bpm

Mean ± 1 SD = 74.0 ± 7.5 = 66.5-81.5 bpm
Mean ± 2 SD = 74.0 ± 15.0 = 59.0-89.0 bpm
Mean ± 3 SD = 74.0 ± 22.5 = 51.5-96.5 bpm

[Histogram: frequency distribution of heart rate (BPM); x axis 55-90]

HR Data:
57/84 (67.9%) subjects are between mean ± 1 SD
82/84 (97.6%) are between mean ± 2 SD
84/84 (100%) are between mean ± 3 SD
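The same kind of tally can be written as a small helper. The Python sketch below is illustrative and uses made-up heart-rate values, not the slides' full 84-subject data set:

```python
import statistics

def coverage(data, k):
    """Fraction of observations within k sample SDs of the sample mean."""
    mean = statistics.mean(data)
    sd = statistics.stdev(data)
    inside = sum(1 for x in data if abs(x - mean) <= k * sd)
    return inside / len(data)

# Hypothetical heart rates (bpm) -- not the 84 values from the slides:
hr = [55, 62, 66, 68, 70, 71, 73, 74, 75, 77, 78, 80, 82, 85, 89]
print(round(coverage(hr, 1), 2))
```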

[Scatter plot: heart rate (bpm) of each of the 84 subjects by subject number, with horizontal reference lines at the mean and at ±1, ±2, and ±3 SD; y axis roughly 45-100 bpm]
The normal range in medical measurements is the central 95%
of the values for a reference population, and is usually determined
from large samples representative of the population.
The central 95% is approximately the mean ± 2 SD*

Some examples of established reference ranges are:

Serum             Normal range
fasting glucose   70-110 mg/dL
sodium            135-146 mEq/L
triglycerides     35-160 mg/dL

*Note: The value is actually 1.96 SD, but for convenience this
is usually rounded to 2 SD.
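A central-95% reference range built from a sample mean and SD can be sketched in Python (an illustration added here, assuming an approximately normal reference population; the glucose-like numbers are hypothetical):

```python
def reference_range(mean, sd, z=1.96):
    """Central 95% range: mean +/- z*SD (z = 1.96, often rounded to 2)."""
    return (mean - z * sd, mean + z * sd)

# Hypothetical example: mean 90 mg/dL, SD 10 mg/dL
low, high = reference_range(90, 10)
print(f"{low:.1f}-{high:.1f} mg/dL")  # 70.4-109.6 mg/dL
```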
Back to Today's Lecture

BASICS OF
HYPOTHESIS
TESTING
An Example to Start with
An imaginary company called "Choose
Your Baby" provided a product called
"Gender Choice"
The claim is: "increase your chances of
having
a boy up to 85%,
a girl up to 80%"
Suppose an experiment with 100
couples who want to have baby girls,
and they all follow the Gender Choice
directions in the pink package
An Example to Start with
For the purpose of testing the claim of an
increased likelihood for girls, we will
assume that Gender Choice has no
effect
Using common sense and no formal
statistical methods, what should we
conclude about the assumption of no effect
from Gender Choice if 100 couples using
Gender Choice have 100 babies consisting
of
52 girls?
97 girls?
An Example to Start with
Normally there are around 50 girls in 100 births
52 girls is close to 50, so we should not
conclude that the Gender Choice product
is effective
If 100 couples used no special method,
the result of 52 girls could easily occur by
chance
The assumption of no effect from Gender
Choice appears to be correct
There isn't sufficient evidence to say that
Gender Choice is effective
An Example to Start with
97 girls in 100 births is extremely unlikely to
occur by chance
We could explain the occurrence of 97 girls
in one of two ways:
Either an extremely rare event has occurred by
chance, or
Gender Choice is effective.
The extremely low probability of getting 97
girls is strong evidence against the
assumption that Gender Choice has no
effect
It does appear to be effective
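The informal reasoning on these slides can be made exact with a binomial calculation. The Python sketch below (an illustration added here, not from the slides) computes P(X ≥ k girls) out of 100 births when girls and boys are equally likely:

```python
from math import comb

def prob_at_least(k, n=100, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

print(prob_at_least(52))  # roughly 0.38 -- easily happens by chance
print(prob_at_least(97))  # astronomically small -- strong evidence of an effect
```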
What is a Hypothesis
The word "hypothesis" is just a slightly
technical or mathematical term for
"sentence", "claim", or
"statement"
A statement that something is true
concerning the population
In statistics, a hypothesis is always
a statement about the value of one
or more population parameter(s)
Parameter vs Statistic
A parameter is
a summary value which in some way characterizes
the nature of the population for the variable
under study
If the measures are computed for data from a
population, they are called population parameters

A statistic is
a summary value calculated from a sample of
observations
If the measures are computed for data from a
sample, they are called sample statistics
Parameter vs Statistic

                     POPULATION   SAMPLE
                     Parameter    Statistic
Number of cases      N            n
Standard Deviation   σ            S, sd
Variance             σ²           S², sd²
Arithmetic Mean      μ            x̄
Hypothesis Test
is a process that uses sample
statistics to test a claim about
the value of a population
parameter
purpose of hypothesis testing is
to determine whether there is
enough statistical evidence in
favor of a certain belief about a
population parameter
is a standard procedure for testing a claim about a population parameter
More on Hypothesis
"H subzero" or "H naught"
A null hypothesis H0 is a statistical hypothesis
that contains a statement of equality such as ≤,
=, or ≥

"H sub-a"

An alternative hypothesis Ha (H1) is the
complement of the null hypothesis. It is a
statement that must be true if H0 is false and
contains a statement of inequality such as >, ≠,
or <
More on Hypothesis
To write the null and alternative
hypotheses, translate the claim
made about the population
parameter from a verbal
statement to a mathematical
statement
Example:
Write the claim as a mathematical sentence.
State the null and alternative hypotheses and
identify which represents the claim.

A drug company claims that their new
medication will effectively control the
blood pressure of hypertensive patients at
a success rate of at least 99.95%

p ≥ 99.95% (condition of equality)

H0: p ≥ 99.95% (Claim)
Ha: p < 99.95% (complement of the null hypothesis)
Example:
Write the claim as a mathematical sentence.
State the null and alternative hypotheses and
identify which represents the claim.

NEU claims that 94% of their graduates find
employment within six months of graduation.

p = 0.94 (condition of equality)

H0: p = 0.94 (Claim)
Ha: p ≠ 0.94 (complement of the null hypothesis)
One Sided vs Two Sided Ha
The null and alternative
hypotheses are complementary.
the null and alternative hypotheses together
cover all possibilities of the values that
the hypothesized parameter can
assume
Two-sided          One-sided
H0: μ = μ0         H0: μ = μ0      H0: μ = μ0
Ha: μ ≠ μ0         Ha: μ > μ0      Ha: μ < μ0
Typical statistical hypotheses are:
μ > 5 cm      P ≥ 0.65
σ² > 2.00     μ1 − μ2 > 0

The following are not statistical hypotheses:

x̄ > 5: because x̄ is not a population
parameter (it is a sample statistic)

"σ is big enough": although it is a statement
about the population parameter σ, the statement
is not quantitative
HYPOTHESIS
TESTING
is the operation of deciding
whether the data obtained for
a random sample succeed or
fail to support a particular
hypothesis
At the conclusion of the hypothesis
test, we will reach one of two
possible decisions:

We will decide in agreement with the
null hypothesis and say that we "fail
to reject H0"
OR
We will decide in opposition to the null
hypothesis and say that we "reject H0"
Types of Errors
No matter which hypothesis represents the
claim, always begin the hypothesis test
assuming that the null hypothesis is true.
At the end of the test, one of two decisions
will be made:
1. reject the null hypothesis, or
2. fail to reject the null hypothesis.
A type I error occurs if the null hypothesis
is rejected when it is true.
A type II error occurs if the null hypothesis
is not rejected when it is false.
There are four possible outcomes that could be
reached as a result of the null hypothesis being either
true or false and the decision being either fail to
reject or reject.

                 Null Hypothesis
Decision         True                     False
Accept H0        Correct Decision (1−α)   Type II Error (β)
Reject H0        Type I Error (α)         Correct Decision (1−β)

α = P(commit a Type I error) = P(reject H0 given that H0 is true)
β = P(commit a Type II error) = P(accept H0 given that H0 is false)
We want to keep both α and β as small as
possible. The value of α is controlled by the
experimenter and is called the significance
level
Generally, with everything else held
constant, decreasing one type of error causes
the other to increase
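The meaning of α can be seen in a small simulation. The Python sketch below (an illustration added here, assuming a two-sided z-test with known σ) repeatedly tests a true null hypothesis and records how often it is wrongly rejected:

```python
import random
import statistics

random.seed(7)
mu0, sigma, n, z_crit = 100.0, 15.0, 25, 1.96  # alpha = 0.05 (two-sided)
trials = 4000

rejections = 0
for _ in range(trials):
    sample = [random.gauss(mu0, sigma) for _ in range(n)]  # H0 is true here
    z = (statistics.mean(sample) - mu0) / (sigma / n**0.5)
    if abs(z) > z_crit:
        rejections += 1  # each rejection is a Type I error

print(rejections / trials)  # close to alpha = 0.05
```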
Balance Between α and β
The only way to decrease both types of
error simultaneously is to increase the
sample size.
No matter what decision is reached,
there is always the risk of one of these
errors.
Balance: identify the largest
significance level α as the maximum
tolerable risk of making a type I error.
Employ a test procedure that makes the
type II error as small as possible while
keeping the type I error smaller than the
given α
Significance Level
denoted by α
the probability that the test statistic
will fall in the critical region when the
null hypothesis is actually true.
common choices are 0.05, 0.01, and
0.10
p-value
A p-value, or probability value, is
the probability of obtaining a
sample at least as extreme as
the observed sample, assuming
the null hypothesis is true
is a measure of inconsistency
between the hypothesized
value under the null hypothesis
and the observed sample
p-value
It measures whether the test
statistic is likely or unlikely,
assuming H0 is true.
Small p-values suggest that the
null hypothesis is unlikely to be
true
The smaller it is, the more convincing is
the rejection of the null hypothesis.
It indicates the strength of evidence for
rejecting the null hypothesis H0
In Other Words
A small p value indicates that the
observed result is unlikely by
chance (therefore statistically
significant) and provides evidence to
reject the null hypothesis
A large p value indicates that the
sample result is not unusual,
therefore not statistically significant - or
that it could easily occur by chance,
which tells us to NOT reject the null
hypothesis
In Other Words
We use our data to calculate the
probability that our finding is
just due to chance, under the
null hypothesis.
This is called the p value.
If the p value is small enough, we
will reject the null hypothesis and
conclude there is a difference.
How small is small enough?
How Small???
A decision as to whether H0
should be rejected
results from comparing the p
value to the chosen significance
level α:
H0 should be rejected if p-value < α
H0 should not be rejected if p-value > α
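For a z test statistic, the two-sided p value and this decision rule can be sketched in Python (an illustration added here; it builds the standard normal CDF from math.erf):

```python
import math

def two_sided_p(z):
    """Two-sided p value for a standard normal test statistic z."""
    cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))  # P(Z <= |z|)
    return 2 * (1 - cdf)

z = 2.5
p = two_sided_p(z)
print(round(p, 4))                                       # 0.0124
print("reject H0" if p < 0.05 else "fail to reject H0")  # reject at alpha = 0.05
print("reject H0" if p < 0.01 else "fail to reject H0")  # fail to reject at alpha = 0.01
```

The same p value can thus lead to different decisions at different significance levels, exactly as in the worked example later in these slides.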
The total area under the normal distribution curve is
1:
90% of the area is between ±1.645 std dev
95% of the area is between ±1.960 std dev
99% of the area is between ±2.575 std dev

[Figure: standard normal curve with the central 90%, 95%, and 99% areas marked between ±1.645, ±1.960, and ±2.575]
The Normal Distribution &
Confidence Intervals
90% of the area is between ±1.645 std dev
95% of the area is between ±1.960 std dev
99% of the area is between ±2.575 std dev
These are the most commonly used areas for
defining
Confidence Intervals,
which are used in inferential statistics to
estimate population values from sample data
If a certain interval is a 95% confidence interval, then we can say
that if we repeated the procedure of drawing random samples and
computing confidence intervals over and over again, 95% of those
confidence intervals include the true value from the population.
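A 95% confidence interval for a mean, using the ±1.960 value above, can be sketched in Python (an illustration added here, assuming a large sample; the inputs reuse the heart-rate summary from earlier in these slides):

```python
import math

def confidence_interval(mean, sd, n, z=1.960):
    """mean +/- z * (sd / sqrt(n)); z = 1.960 gives a 95% interval."""
    margin = z * sd / math.sqrt(n)
    return (mean - margin, mean + margin)

# Heart-rate example: mean 74.0 bpm, SD 7.5 bpm, n = 84
low, high = confidence_interval(74.0, 7.5, 84)
print(f"95% CI: {low:.1f} to {high:.1f} bpm")  # 95% CI: 72.4 to 75.6 bpm
```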
Using p value to Make a
Decision
Example:
The p value for a hypothesis test is p = 0.0256.
What is your decision if the level of significance is
a.) α = 0.05?
b.) α = 0.01?

a.) Because 0.0256 < 0.05, you should reject the
null hypothesis.

b.) Because 0.0256 > 0.01, you should fail to
reject the null hypothesis.
When p value > α, state "fail to
reject H0" or "not to reject H0"
rather than "accept H0"

Write "there is insufficient
evidence to reject H0"
Statistical software automates the
task of calculating the test statistic
and its p value
You must still decide which test is
appropriate and whether to use a
one-sided or two-sided test
You must also decide what conclusion
the computer's numbers support
PARAMETRIC
vs
NONPARAMETRIC
Hypothesis
Testing

Parametric Tests                     Nonparametric Tests

            Sampling should be random!

Population should be normally        No assumption on the distribution
distributed.                         of the population.
Variables should be continuous.      No assumption on the type of the
                                     variable.
# of observations should be          No assumption on the # of
greater than 10.                     observations.
Parametric vs
Nonparametric
Nonparametric tests are used as an alternative to
parametric tests
They are usually used when the distribution of the
underlying population is non-normal
Even if the underlying population is normally
distributed, nonparametric tests are used when the
sample size is small (n<10)
While parametric tests are used to test
hypotheses based on the population mean, proportion,
and standard deviation,
nonparametric tests are used to test hypotheses
based on the median or the distribution of samples
Parametric statistics (Parametric
tests)

Parametric statistics are
appropriate if quantitative data
are used and the data satisfy the
statistical assumptions of the
specific test

A common assumption is that


each outcome is normally
distributed (bell shaped)
Parametric Tests
There are two main advantages
of using parametric statistics:
Parametric statistics are powerful
analyses in the sense that they
exploit all the information available
in the continuous data.
Parametric statistics can be applied
to many complex designs. It
enables analyses of interaction
effects, as well as the simultaneous
analyses of multiple dependent
variables.
Nonparametric statistics
(Nonparametric tests)
Nonparametric statistics do not rely on
normal distribution and may be
appropriate if our data are nominal or
ordinal level, the sample size is small, or
the scores are not normally distributed.
Nonparametric statistics are less
powerful than parametric statistics
because information is lost in the analysis.
That is, continuous scores are replaced
with rank scores, which provide less
information. This in effect reduces the
sensitivity of detecting changes in data.
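The rank replacement described here is mechanical. The Python sketch below (an illustration added here, not from the slides) converts scores to ranks, averaging the ranks of tied scores, as rank-based nonparametric tests do:

```python
def to_ranks(scores):
    """Replace scores with ranks (1 = smallest); tied scores share the average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # extend j over any run of tied scores
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

print(to_ranks([10, 20, 20, 30]))  # [1.0, 2.5, 2.5, 4.0]
```

Notice how the two tied scores of 20 each receive the average rank 2.5; the original gap between 10 and 20 is lost, which is the information loss described above.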
Parametric versus
Nonparametric
Parametric tests are based on assumptions
about the distribution of the underlying
population from which the sample was
taken. The most common parametric
assumption is that data are
approximately normally distributed.
Nonparametric tests do not rely on
assumptions about the shape or
parameters of the underlying population
distribution.
If the data deviate strongly from the
assumptions of a parametric procedure,
using the parametric procedure could lead to
incorrect conclusions.
Parametric versus
Nonparametric
You should be aware of the assumptions
associated with a parametric procedure and
should learn methods to evaluate the validity
of those assumptions.
If you determine that the assumptions of the
parametric procedure are not valid, use an
analogous nonparametric procedure instead.
The parametric assumption of normality is
particularly worrisome for small sample
sizes. Nonparametric tests are often a good
option for these data.
Parametric versus
Nonparametric
It can be difficult to decide whether to use a
parametric or nonparametric procedure in
some cases.
Visit a statistician if you are in doubt about
whether parametric or nonparametric
procedures are more appropriate for your
data.
Independent vs Dependent
(Unpaired vs Paired )
Two samples are said to be paired when each
data point in the first sample is matched and is
related to a unique data point in the second
sample
The paired samples may represent at least two
sets of measurements on the same people. In
this case each person is serving as his or her
own control
Samples (groups) are independent when the
objects in each of them are different individuals
and the measurements in one group do not
affect the measurements of other groups
Examples
Few Steps to Systemize
Hypothesis Tests Selection
In the following few slides, a simple
but effective strategy for selecting an
appropriate statistical hypothesis
testing method is given
Remember ! These steps include
only basic and most common
methods and selection must be
carefully done without violating
any critical assumptions about
the data
Parametric Assumptions
1. Check if every group has at least 10
objects
2. Check if the number of objects are close
to each other for every study group
3. Be sure that the variable being tested
between the groups is continuous
4. Do a normality test for each group and be
sure that each of them is normally
distributed
5. Do the homogeneity of variances test and
see if the groups have homogenous
variances
DATA: CONTINUOUS or CATEGORICAL?

CONTINUOUS DATA
One sample:
  Parametric: One Sample t Test
  Nonparametric: Sign Test
Two samples, independent:
  Parametric: Student's t Test
  Nonparametric: Mann Whitney U Test
Two samples, paired:
  Parametric: Paired Samples t Test
  Nonparametric: Wilcoxon Signed Rank Test
>2 samples, independent:
  Parametric: One Way Analysis of Variance (ANOVA)
  Nonparametric: Kruskal Wallis Analysis of Variance
>2 samples, paired:
  Parametric: Repeated Measures ANOVA
  Nonparametric: Friedman Test

CATEGORICAL DATA
One sample:
  Parametric: One sample difference of proportions test
  Nonparametric: One Sample Chi Square Test
Two samples, independent:
  Nonparametric: 2 x 2 Chi Square Test, Fisher's Exact Test
Two samples, paired:
  Nonparametric: McNemar Test
>2 samples, independent:
  Nonparametric: N x M Chi Square Test
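The selection chart above can be mirrored in a small lookup function. The Python sketch below is illustrative and covers only the tests named on these slides; any real analysis should still check the parametric assumptions first:

```python
def select_test(data, samples, paired=False, parametric=True):
    """Map (data type, number of samples, pairing, assumptions) to a test name."""
    continuous = {
        (1, False, True): "One Sample t Test",
        (1, False, False): "Sign Test",
        (2, False, True): "Student's t Test",
        (2, False, False): "Mann Whitney U Test",
        (2, True, True): "Paired Samples t Test",
        (2, True, False): "Wilcoxon Signed Rank Test",
        (3, False, True): "One Way ANOVA",
        (3, False, False): "Kruskal Wallis Analysis of Variance",
        (3, True, True): "Repeated Measures ANOVA",
        (3, True, False): "Friedman Test",
    }
    categorical = {
        (1, False, True): "One sample difference of proportions test",
        (1, False, False): "One Sample Chi Square Test",
        (2, False, False): "2 x 2 Chi Square Test (or Fisher's Exact Test)",
        (2, True, False): "McNemar Test",
        (3, False, False): "N x M Chi Square Test",
    }
    key = (min(samples, 3), paired, parametric)  # 3 stands for ">2 samples"
    table = continuous if data == "continuous" else categorical
    return table.get(key, "consult a statistician")

print(select_test("continuous", 2, paired=False, parametric=False))  # Mann Whitney U Test
```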
