Biostatistics I
Dr. Mushfiq Tarafder
September 7, 2017

Copyright Notice
All materials found on GCSoM’s course and project sites may be subject to copyright protection, and may be restricted from further dissemination, retention or copying.
DISCLOSURE
The presenter of this material has no current financial relationship with a commercial entity relevant to this educational material.

Learning Objectives
• Discuss the role of statistical tests in epidemiologic research
• Discuss and interpret p value and confidence interval
• Describe elementary properties of probability
• Discuss joint and conditional probabilities
• Explain addition and multiplication rules of probability
• Discuss Bayesian probability and decision‐making
Statistical Measures of Association
• The P value
  – Statistical significance tests
    • Test hypotheses about the population
• Confidence interval

Population and Sample
[Figure: population and sample] (Fletcher et al. Clinical Epidemiology, 5th ed.)
8/4/2017
Statistical Significance Tests
• Decide whether to reject or fail to reject a null hypothesis
  – Null hypothesis
    • there is no association between the exposure and disease
• Involves computation of a test statistic
  – compared with a critical value set by the significance level of the test
• Significance level (α)
  – chance of rejecting the null hypothesis when, in fact, it is true.

Alpha, Beta, and Power
                              Truth in the Population
Results in the Study Sample   Association                                 No Association
Association                   Correct (1 – beta), true positive, power    Alpha (Type I error), false positive
No Association                Beta (Type II error), false negative        Correct (1 – alpha), true negative
• Power = 1 – beta
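The relationship between alpha, beta, and power can be sketched numerically. The example below assumes a one-sided z-test for a shift in a mean with a known standard deviation; that test, the function name `z_test_power`, and every input value are illustrative assumptions and do not come from the slides.

```python
from statistics import NormalDist


def z_test_power(effect, sigma, n, alpha=0.05):
    """Power (1 - beta) of a one-sided z-test; a hypothetical setup for illustration."""
    std_normal = NormalDist()                # standard normal: mean 0, sd 1
    z_crit = std_normal.inv_cdf(1 - alpha)   # critical value set by the significance level
    shift = effect * n ** 0.5 / sigma        # mean of the test statistic when the effect is real
    beta = std_normal.cdf(z_crit - shift)    # chance of failing to reject a false null (Type II error)
    return 1 - beta


# Assumed inputs: true mean shift 0.5, sd 1, 25 subjects, alpha 0.05
power = z_test_power(effect=0.5, sigma=1.0, n=25)
print(f"power = {power:.3f}")

# With no true effect, the chance of rejecting the null equals alpha (Type I error)
print(f"rejection rate under the null = {z_test_power(0.0, 1.0, 25):.3f}")
```

Under these assumptions, raising `n` or `effect` increases power, while lowering `alpha` reduces it.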
The P Value
• Indicates the probability that findings at least as extreme as those observed could have occurred by chance alone, that is, if the null hypothesis were true.
• When p value ≤ significance level (α)
  – Difference is statistically significant
• When p value > significance level (α)
  – Difference is not statistically significant

The P Value
• For studies with a small sample size the sampling variability may be large
  – can lead to a non‐significant test even if the observed difference is caused by a real effect.
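A minimal sketch of the decision rule and of the sample-size effect, assuming a two-sided z-test on a mean difference with a known standard deviation. The test choice, the helper names `two_sided_p_value` and `z_statistic`, and all numbers are hypothetical and are not taken from the slides.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p_value(z):
    """Two-sided p value for a z statistic under the standard normal distribution."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


def z_statistic(difference, sigma, n):
    """z statistic for an observed mean difference with known sd (assumed setup)."""
    return difference / (sigma / sqrt(n))


alpha = 0.05       # significance level
difference = 0.3   # hypothetical observed difference
sigma = 1.0        # hypothetical known standard deviation

# Same observed difference, two different sample sizes
for n in (100, 16):
    p = two_sided_p_value(z_statistic(difference, sigma, n))
    verdict = "significant" if p <= alpha else "not significant"
    print(f"n = {n:3d}: p = {p:.4f} -> {verdict}")
```

The same difference passes the threshold with 100 subjects but not with 16, because the smaller sample has larger sampling variability.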
Confidence Interval (CI)
• A computed interval of values that, with a given probability, contains the true value of the population parameter.
• The degree of confidence is usually stated as a percentage
  – Commonly 95% CI is calculated.
    • Setting α value (type I error) at 5%

Confidence Interval (CI)
• Interpretation
  – If we sample a population 100 times and compute a 95% CI from each sample, about 95 of those intervals are expected to contain the true population value.
• Influenced by variability of the data and sample size.
• It is also a measure of precision of the estimate
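The repeated-sampling interpretation can be checked with a small simulation. This is a sketch under stated assumptions: a normally distributed population, a known standard deviation, and a z-based interval for the mean. The function name `ci_coverage` and all parameter values are hypothetical.

```python
import random
from statistics import NormalDist, mean


def ci_coverage(true_mean=10.0, sigma=2.0, n=30, repeats=2000, level=0.95, seed=1):
    """Fraction of repeated-sample CIs that contain the true mean (known-sigma z interval)."""
    rng = random.Random(seed)                        # fixed seed so the run is reproducible
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)    # about 1.96 for a 95% CI
    half_width = z * sigma / n ** 0.5                # narrows as n grows or sigma shrinks
    hits = 0
    for _ in range(repeats):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        estimate = mean(sample)
        if estimate - half_width <= true_mean <= estimate + half_width:
            hits += 1
    return hits / repeats


coverage = ci_coverage()
print(f"share of intervals containing the true mean: {coverage:.3f}")
```

The printed share should land close to 0.95, and `half_width` shows why the interval also measures precision: it depends only on the variability of the data and the sample size.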
Statistical significance, p value, CI and sample size
Results of five hypothetical studies on the risk of breast cancer following childhood exposure to tobacco smoke
Study (No. of subjects)   Relative Risk   95% Confidence Interval   P value   Statistically significant (at α = 0.05)
A (n=2,500)               1.4             1.2 – 1.7                 0.02      Yes
B (n=500)                 1.7             0.7 – 3.1                 0.10      No
C (n=2,000)               1.6             1.2 – 2.1                 0.04      Yes
D (n=250)                 1.8             0.6 – 3.9                 0.30      No
E (n=1,000)               1.6             0.9 – 2.5                 0.06      No
(Aschengrau and Seage. Epidemiology in Public Health. 3rd ed.)

Clinical vs. Statistical Significance
• Small differences in disease frequency or small OR (or RR) may be statistically significant
  – may have no clinical significance.
• Conversely, clinically important differences or measures of association may not be statistically significant
  – small sample sizes.
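The five hypothetical studies in the table above can be classified programmatically. The sketch below assumes a significance level of 0.05 and checks that two criteria agree: p value ≤ alpha, and a 95% CI for the relative risk that excludes 1. The variable names are illustrative.

```python
# (study, n, relative risk, CI lower, CI upper, p value), copied from the table
studies = [
    ("A", 2500, 1.4, 1.2, 1.7, 0.02),
    ("B", 500, 1.7, 0.7, 3.1, 0.10),
    ("C", 2000, 1.6, 1.2, 2.1, 0.04),
    ("D", 250, 1.8, 0.6, 3.9, 0.30),
    ("E", 1000, 1.6, 0.9, 2.5, 0.06),
]

alpha = 0.05  # assumed significance level

# Criterion 1: p value at or below the significance level
significant_by_p = [name for name, _, _, _, _, p in studies if p <= alpha]

# Criterion 2: 95% CI excludes the null value of the relative risk (RR = 1)
significant_by_ci = [name for name, _, _, low, high, _ in studies if low > 1.0 or high < 1.0]

print("significant by p value:", significant_by_p)
print("significant by CI:     ", significant_by_ci)
```

Studies B and D report the largest relative risks but have the smallest samples and the widest intervals, which is the sample-size point the table is making.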
Probability
• If an event can occur in N mutually exclusive and equally likely ways, and if m of these possess a characteristic, E, the probability of the occurrence of E is
  – P(E) = m/N
  – P(rolling a five) = 1/6 for a fair six-sided die
• Two events are said to be mutually exclusive if they cannot occur simultaneously

Elementary properties of probability
1. Given some experiment with n mutually exclusive outcomes (called events), E1, E2, …, En, the probability of any event Ei is assigned a nonnegative number.
  – P(Ei) ≥ 0
  – So all events must have a probability greater than or equal to zero.
Elementary properties of probability
2. The sum of the probabilities of all mutually exclusive outcomes is equal to 1.
  – P(E1)+P(E2)+ … +P(En) = 1

Elementary properties of probability
3. For any 2 mutually exclusive events, A and B, the probability of the occurrence of A or B is equal to the sum of their individual probabilities.
P(A or B) = P(A) + P(B)
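The three elementary properties can be verified on a simple case. The sketch below assumes a fair six-sided die, so each face has probability 1/6; the die and the variable names are illustrative choices.

```python
from fractions import Fraction

# Assumed experiment: one roll of a fair six-sided die (N = 6 equally likely outcomes)
faces = range(1, 7)
prob = {face: Fraction(1, 6) for face in faces}

# Property 1: every outcome has a nonnegative probability
all_nonnegative = all(p >= 0 for p in prob.values())

# Property 2: probabilities of all mutually exclusive outcomes sum to 1
total_probability = sum(prob.values())

# Classical probability: P(E) = m/N with m = 1 favourable outcome
p_five = prob[5]

# Property 3: for mutually exclusive events, P(A or B) = P(A) + P(B)
p_five_or_six = prob[5] + prob[6]

print(all_nonnegative, total_probability, p_five, p_five_or_six)
```

Exact fractions are used so the sum comes out to exactly 1 rather than a rounded floating-point value.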
Frequency of daily alcohol use among adults
Daily drinks   Male   Female   Total
0              12     15       27
1‐2            36     35       71
3 or more      52     30       82
Total          100    80       180
What is the probability that a person randomly chosen from the 180 subjects is male?
P(male) = number of males / total number of subjects
= 100 / 180 = 0.556

Conditional Probability
• Conditional probability is a probability that only involves a subset of a total group.
  – The group has been restricted due to conditions or characteristics
  – Suppose we choose a person at random from the 180 subjects and find that the subject is female.
    • What is the probability that this female is one who drinks 3 or more drinks a day?
    • P(3+ drinks | female)
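The marginal and conditional probabilities can be computed directly from the table counts. This is a minimal sketch; the dictionary layout and variable names are illustrative.

```python
# Counts copied from the table: daily drinks by sex
counts = {
    "0": {"male": 12, "female": 15},
    "1-2": {"male": 36, "female": 35},
    "3+": {"male": 52, "female": 30},
}

total = sum(sum(row.values()) for row in counts.values())   # all 180 subjects
males = sum(row["male"] for row in counts.values())         # column total for males
females = sum(row["female"] for row in counts.values())     # column total for females

# Marginal probability: denominator is the whole group
p_male = males / total

# Conditional probability: denominator is restricted to females only
p_heavy_given_female = counts["3+"]["female"] / females

print(f"P(male) = {p_male:.3f}")
print(f"P(3+ drinks | female) = {p_heavy_given_female:.3f}")
```

The only difference between the two calculations is the denominator: 180 subjects for the marginal probability, 80 females for the conditional one.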
Joint Probability
• The probability that a subject picked at random possesses two characteristics at the same time.
  – What is the probability that a person randomly chosen from the 180 subjects is female and drinks 3 or more drinks a day?
    • P(3+ drinks and Female)
    • Can also be written as P(3+ drinks ∩ Female)

Multiplication Rule
• P(A and B) = P(B) * P(A|B); if P(B) ≠ 0
• P(A and B) = P(A) * P(B|A); if P(A) ≠ 0
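As a check on the multiplication rule, the joint probability from the alcohol-use table can be computed three ways: directly from the cell count, and from each form of the rule. The variable names are illustrative.

```python
from fractions import Fraction

# Counts from the alcohol-use table
total = 180            # all subjects
females = 80           # column total for females
heavy = 82             # row total for 3 or more drinks a day
female_and_heavy = 30  # females who drink 3 or more drinks a day

# Direct joint probability: P(3+ drinks and Female)
p_joint = Fraction(female_and_heavy, total)

# Multiplication rule, conditioning on sex: P(Female) * P(3+ drinks | Female)
via_female = Fraction(females, total) * Fraction(female_and_heavy, females)

# Multiplication rule, conditioning on drinking: P(3+ drinks) * P(Female | 3+ drinks)
via_heavy = Fraction(heavy, total) * Fraction(female_and_heavy, heavy)

print(p_joint, via_female, via_heavy, f"{float(p_joint):.3f}")
```

All three routes give the same value, 30/180, which reduces to 1/6.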
The Addition Rule
• Given two events A and B, the probability that event A occurs, or event B occurs, is the probability that A occurs plus the probability that B occurs minus the probability that the events occur simultaneously.
• P(A or B) = P(A) + P(B) – P(A and B)

Elementary properties of probability
3. For any 2 mutually exclusive events, A and B, the probability of the occurrence of A or B is equal to the sum of their individual probabilities.
P(A or B) = P(A) + P(B)
The Addition Rule
• What is the probability that a person randomly chosen from the 180 subjects is female or drinks 3 or more drinks a day?
  – P(Female or 3+ drinks) = P(Female ∪ 3+ drinks)
• What is the probability that a person randomly chosen from the 180 subjects is someone who doesn’t drink or drinks 3 or more drinks a day?
  – P(0 drink or 3+ drinks) = P(0 drink ∪ 3+ drinks)
  – Third elementary property of probability

Daily drinks   Male   Female   Total
0              12     15       27
1‐2            36     35       71
3 or more      52     30       82
Total          100    80       180
• Conditional probability: P(3+ drinks | female)
• Joint probability: P(3+ drinks ∩ Female)
  – Multiplication rule
• Addition rule: P(3+ drinks ∪ Female)
• Addition rule: P(0 drink ∪ 3+ drinks)
  – 3rd elementary property of probability (mutually exclusive)
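Both addition-rule questions can be worked from the table counts. The sketch below uses exact fractions; the variable names are illustrative.

```python
from fractions import Fraction

# Counts from the alcohol-use table
total = 180
females = 80           # column total for females
none = 27              # row total for 0 drinks a day
heavy = 82             # row total for 3 or more drinks a day
female_and_heavy = 30  # overlap between "female" and "3 or more drinks"

# Overlapping events: subtract the joint probability so the overlap is counted once
p_female_or_heavy = (
    Fraction(females, total) + Fraction(heavy, total) - Fraction(female_and_heavy, total)
)

# Mutually exclusive events (nobody is in both drinking categories): plain sum
p_none_or_heavy = Fraction(none, total) + Fraction(heavy, total)

print(f"P(Female or 3+ drinks)  = {p_female_or_heavy} = {float(p_female_or_heavy):.3f}")
print(f"P(0 drink or 3+ drinks) = {p_none_or_heavy} = {float(p_none_or_heavy):.3f}")
```

The second case needs no subtraction because the two drinking categories cannot occur together, which is the third elementary property of probability.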
Bayesian Statistics
• Classical statistical approach (frequentist)
  – decisions made solely on the basis of the data collected in the study
• Bayesian approach
  – uses the results of a study to modify, in a quantitative way, your prior expectations (beliefs) of the relationship you are studying

Bayesian Statistics
• Prior
  – everything known about the answer prior to the new study
  – Prior probability
    • probability based on prior knowledge, prior experience, or results derived from prior data collection activity
• Posterior
  – change in knowledge based on the information from the new study
  – Posterior probability
    • probability obtained by using new information to update or revise a prior
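A minimal sketch of updating a prior into a posterior with Bayes' theorem, framed as a diagnostic test: the prior is the disease prevalence and the posterior is the predictive value positive. The function name `posterior_probability` and the sensitivity, specificity, and prevalence values are hypothetical and are not taken from the slides.

```python
def posterior_probability(prior, sensitivity, specificity):
    """P(disease | positive test) via Bayes' theorem.

    prior        P(disease), for example the prevalence
    sensitivity  P(positive test | disease)
    specificity  P(negative test | no disease)
    """
    true_positive = sensitivity * prior                # P(T+ | D) * P(D)
    false_positive = (1 - specificity) * (1 - prior)   # P(T+ | not D) * P(not D)
    return true_positive / (true_positive + false_positive)


# Hypothetical test: 90% sensitivity, 95% specificity
for prior in (0.01, 0.20):
    posterior = posterior_probability(prior, sensitivity=0.90, specificity=0.95)
    print(f"prior = {prior:.2f} -> posterior = {posterior:.3f}")
```

With the same test, a rare condition yields a much lower posterior than a common one, which is why a poor estimate of the prior can mislead the conclusion.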
Bayesian Statistics
• As more data (information) are gathered, we are more likely to know the “true” probability of the event
• Bayes’ theorem
  – P(A|B) = [P(B|A) * P(A)] / [P(B|A) * P(A) + P(B|not A) * P(not A)] [where P(B) > 0]
  – Predictive value positive can be explained using Bayes’ theorem
    • P(D|T+) = [P(T+|D) * P(D)] / [P(T+|D) * P(D) + P(T+|not D) * P(not D)], where D denotes disease and T+ a positive test result
• Limitation: difficulty in obtaining good estimates of the prior probabilities

References
1. Merrill RM. Statistical Methods in Epidemiologic Research. 1st ed. Burlington, MA: Jones and Bartlett Learning; 2015.
2. Aschengrau A, Seage GR. Essentials of Epidemiology in Public Health. 3rd ed. Burlington, MA: Jones and Bartlett Learning; 2013.
3. Daniel WW. Biostatistics: A Foundation for Analysis in the Health Sciences. 9th ed. John Wiley and Sons, Inc.; 2009.
4. Fletcher RH, Fletcher SW, Fletcher GS. Clinical Epidemiology: The Essentials. 5th ed. Philadelphia, PA: Lippincott Williams & Wilkins; 2012.
5. Glantz SA. Primer of Biostatistics. 7th ed. New York, NY: McGraw Hill; 2011.
6. Merrill RM. Fundamentals of Epidemiology and Biostatistics: Combining the Basics. 1st ed. Burlington, MA: Jones and Bartlett Learning; 2012.