• Where: $m_d = \frac{\sum d}{n}$, $S_d = \sqrt{\frac{n\sum d^2 - (\sum d)^2}{n(n-1)}}$ and $d.f = n - 1$
• N.B) t-test is not used with proportions.
• N.B) From the rules of confidence intervals using the t-distribution, we notice that the interval is affected by sample size more than when using the Z distribution: with the t-distribution, the tabulated value is also affected by the sample size, not only the standard error.
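The paired-difference formulas above can be sketched in Python as follows. This is only an illustration: the function name `paired_summary` and the before/after readings are hypothetical and do not come from the slides.

```python
import math

def paired_summary(before, after):
    """Return (mean difference, SD of the differences, degrees of freedom)."""
    d = [b - a for b, a in zip(before, after)]
    n = len(d)
    sum_d = sum(d)
    sum_d2 = sum(x * x for x in d)
    md = sum_d / n
    # Shortcut formula: Sd = sqrt((n*sum(d^2) - (sum d)^2) / (n*(n-1)))
    sd = math.sqrt((n * sum_d2 - sum_d ** 2) / (n * (n - 1)))
    return md, sd, n - 1

# Hypothetical readings, for illustration only (not from the slides).
before = [39.0, 38.5, 39.2, 38.8]
after = [38.0, 38.0, 38.4, 38.2]
md, sd, df = paired_summary(before, after)
print(round(md, 3), round(sd, 3), df)
```

The shortcut formula gives the same value as the usual sample standard deviation of the differences; it is convenient when only ∑d and ∑d² are tabulated.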
• Example 1: using specimens obtained
from 10 individuals for teeth contents of
calcium with m = 35.7 & S = 0.7, what is
the expected population mean?
• Estimation
• 95% C.I for μ = $m \pm t_{9,\,0.975} \cdot S/\sqrt{n}$ = 35.7 ± (2.262 × 0.7/√10) = 35.7 ± 0.50 = 35.20 to 36.20
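As a quick check of the arithmetic in Example 1, here is a minimal sketch. The helper name `t_interval` is hypothetical; the tabulated value 2.262 (t with 9 d.f at 0.975) is taken from the example rather than computed.

```python
import math

def t_interval(mean, sd, n, t_tab):
    """Confidence interval for a mean: mean ± t_tab * sd / sqrt(n)."""
    half_width = t_tab * sd / math.sqrt(n)
    return mean - half_width, mean + half_width

lower, upper = t_interval(35.7, 0.7, 10, 2.262)
print(f"{lower:.2f} to {upper:.2f}")  # 35.20 to 36.20
```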
• Example 2: a breast cancer researcher collected the following data on tumor size: type A (n1 = 21, m1 = 3.85 cm & S1 = 1.95 cm) & type B (n2 = 16, m2 = 2.8 cm & S2 = 1.70 cm). What is the 95% C.I for the difference in the population means?
• Here the two groups are of two different cancer types, so the population variances are taken to be different (unequal).
• $d.f = \frac{(S_1^2/n_1 + S_2^2/n_2)^2}{\frac{(S_1^2/n_1)^2}{n_1 - 1} + \frac{(S_2^2/n_2)^2}{n_2 - 1}} \approx 34.3$ (while using d.f = n1 + n2 - 2 gives 35, so either of the two points towards nearly the same tabulated value)
• 95% C.I for (μ1-μ2) = $(m_1 - m_2) \pm t_{d.f,\,1-\alpha/2} \sqrt{S_1^2/n_1 + S_2^2/n_2}$
= (3.85 - 2.8) ± 2.0307 × √[(1.95)²/21 + (1.7)²/16] = -0.17 to 2.27 (not significant, as the interval includes zero)
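The unequal-variance interval in Example 2 can be reproduced with a short sketch. The function names are hypothetical, the degrees of freedom use the Welch-Satterthwaite form with n - 1 in the denominators (an assumption about the intended formula), and the tabulated t value 2.0307 is taken from the example.

```python
import math

def welch_df(s1, n1, s2, n2):
    """Approximate d.f for two samples with unequal variances."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

def unequal_var_interval(m1, s1, n1, m2, s2, n2, t_tab):
    """(m1 - m2) ± t_tab * sqrt(s1^2/n1 + s2^2/n2)."""
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    diff = m1 - m2
    return diff - t_tab * se, diff + t_tab * se

df = welch_df(1.95, 21, 1.70, 16)
lower, upper = unequal_var_interval(3.85, 1.95, 21, 2.8, 1.70, 16, 2.0307)
print(round(df, 1))                       # 34.3
print(round(lower, 2), round(upper, 2))   # -0.17 2.27
```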
• Example 3: in a cancer institute, a study was done on a drug to prolong life in patients with throat cancer. 200 patients were randomly divided into 2 equal groups: on the drug (m1 = 4.6 months, S1 = 2.5 months) & on placebo (m2 = 3.4 months & S2 = 1.3 months). Assuming normality & equal population variances, find the 95% C.I for (μ1-μ2) and interpret the result.
• $S_p = \sqrt{\frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}}$ = √[(99 × 2.5² + 99 × 1.3²)/198] = √3.97 = 1.993
• d.f = 100 + 100 - 2 = 198
• $t_{198,\,0.975}$ = 1.9719 (from t-table)
• C.I for (μ1-μ2) = $(m_1 - m_2) \pm t_{d.f,\,1-\alpha/2}\, S_p \sqrt{1/n_1 + 1/n_2}$
• 95% C.I for (μ1-μ2) = (4.6 - 3.4) ± [1.9719 × 1.993 × √(1/100 + 1/100)] = 1.2 ± 0.556 = 0.644 to 1.756
• Since zero is not included in the interval,
then the drug under trial significantly
prolongs life.
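The pooled-variance calculation in Example 3 can be checked with the sketch below. The function names are hypothetical, and the tabulated t value 1.9719 (198 d.f) is taken from the example rather than computed.

```python
import math

def pooled_sd(s1, n1, s2, n2):
    """Sp = sqrt(((n1-1)*s1^2 + (n2-1)*s2^2) / (n1 + n2 - 2))."""
    return math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))

def pooled_interval(m1, s1, n1, m2, s2, n2, t_tab):
    """(m1 - m2) ± t_tab * Sp * sqrt(1/n1 + 1/n2)."""
    sp = pooled_sd(s1, n1, s2, n2)
    half_width = t_tab * sp * math.sqrt(1 / n1 + 1 / n2)
    diff = m1 - m2
    return diff - half_width, diff + half_width

sp = pooled_sd(2.5, 100, 1.3, 100)
lower, upper = pooled_interval(4.6, 2.5, 100, 3.4, 1.3, 100, 1.9719)
print(round(sp, 3))                       # 1.992
print(round(lower, 3), round(upper, 3))   # 0.644 1.756
```

With equal group sizes the pooled variance is simply the average of the two sample variances, (6.25 + 1.69)/2 = 3.97.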
• Example 4: in a pediatric clinic, a study was carried out to assess the effectiveness of a certain antipyretic on 12 four-year-old girls suffering from flu. Their temperature was taken immediately before and 1 hr after drug administration. For the following results, find the 95% C.I for the mean difference.
I.D no.    Before    After    d    d²
Female 62 50.82 %
Male 60 49.18 %
[Figure: line graph of deaths in thousands (Y-axis, 380 to 520) by year (X-axis, 79 to 01) for males and females. Source: CDC/NCHS.]
Histogram
Graphical display of the frequency distribution of a quantitative variable. The values of the quantitative variable (as class intervals) are placed on the X-axis (representing the width of the rectangles), and the corresponding frequencies (or relative frequencies) are placed on the Y-axis (representing the height of the rectangles).
[Figure: histogram with X-axis ticks at 0, 15, 25, 35, 45, 55.]
Frequency Polygon
Another form of graphical presentation of the frequency distribution of a quantitative variable. It is similar to the histogram, but instead of using rectangles to present the data, the midpoints of the tops of the rectangles are plotted and connected by straight lines.
Scatter Diagram
A pair of measurements is plotted as a single point on a graph. The value of one variable of each pair is plotted on the X-axis and the value of the other variable is plotted on the Y-axis.
Showing the relation
• A simple and effective way to examine the relation between two quantitative variables is a scatter diagram.
• Each point corresponds to one subject.
[Figure: scatter diagram.]
[Figure labels: foam cells, fibrous cap, lipid core, thrombus.]
<160 31 3 3
Disease Status
+ - Totals
Where TP = true positive, TN = true negative, FP = false positive & FN = false negative.
• Gold Standard Test: the test that never errs, i.e. if there is disease the test is positive, and if there is no disease the test is negative. But it is often difficult and/or costly, so we use simpler tests that carry some probability of giving false readings (FP & FN), giving results as in the table above.
Example on the diagnosis of T.B

                 Bacilli in sputum
                 +      -      Totals
CXR      +       7      4      11
         -       3      86     89
Totals           10     90     100
So 4 persons were diagnosed as having TB by CXR while they do not have it, and 3 cases of TB were labeled as not diseased while they are diseased.
• Marginal Probability: so called because it deals with the probabilities in the margins of the table (the cells of marginal totals). From the example above:
• Probability of testing positive = P (T+) = 11/100 = 0.11
• Probability of testing negative = P (T-) = 89/100 = 0.89
• Probability of having the disease = P (D+) = 10/100 = 0.10
• Probability of not having the disease = P (D-) = 90/100 = 0.90
• Joint Probability: the probability of 2 or more events occurring simultaneously, i.e. the probability that a subject picked at random (from a group) has both events. From the previous example:
• Probability of disease+ & test+ = P (D+&T+) = 7/100 = 0.07
• Probability of disease+ & test- = P (D+&T-) = 3/100 = 0.03
• Conditional Probability: is the probability of an
event occurring given that another event has already occurred.
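The three kinds of probability can be computed from the CXR versus sputum-bacilli table with a small sketch. The dictionary layout and the function names are hypothetical. The conditional probability is taken as the joint probability divided by the marginal probability of the conditioning event.

```python
# Counts from the CXR vs. bacilli-in-sputum table; keys are (test, disease).
counts = {
    ("T+", "D+"): 7, ("T+", "D-"): 4,
    ("T-", "D+"): 3, ("T-", "D-"): 86,
}
total = sum(counts.values())

def marginal(label):
    """P(label), summing every cell whose key contains the label."""
    return sum(c for key, c in counts.items() if label in key) / total

def joint(test, disease):
    """P(test and disease)."""
    return counts[(test, disease)] / total

def conditional(test, disease):
    """P(test | disease) = P(test and disease) / P(disease)."""
    return joint(test, disease) / marginal(disease)

print(marginal("T+"))                       # 0.11
print(joint("T+", "D+"))                    # 0.07
print(round(conditional("T+", "D+"), 2))    # 0.7
```

Here P(T+ | D+) = 0.07/0.10 = 0.7: among the 10 diseased subjects, 7 test positive.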
                        +      -      Total
Gd test          +      175    9      184
(Test Result)    -      8      48     56
                          +      -      Total
The state of       +      90     30     120
eating Barbecue    -      20     60     80
Total                     110    90     200
PROBABILITY DISTRIBUTION
• $Z = \frac{(p_1 - p_2) - (P_1 - P_2)}{\sqrt{\frac{P_1(1 - P_1)}{n_1} + \frac{P_2(1 - P_2)}{n_2}}}$
• Example: in a population of teenagers, 10% of boys are obese (P1) & 10% of girls are obese (P2). What is the probability that a random sample of 250 boys (n1) and 200 girls (n2) will yield a difference (p1-p2) ≥ 0.06?
• Z = [0.06 - (0.1 - 0.1)] / √[{0.1(1 - 0.1)}/250 + {0.1(1 - 0.1)}/200] = 2.11
• P(Z ≤ 2.11) = 0.9826 from the z-table → P(Z ≥ 2.11) = 1 - 0.9826 = 0.0174
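The z calculation above can be sketched without a z-table by using the error function from the standard library. The function names are hypothetical. Because the code does not round z to 2.11 before the lookup, the tail probability differs slightly from the table value in the fourth decimal place.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative probability P(Z <= z)."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def z_diff_proportions(p_diff, P1, n1, P2, n2):
    """Z for an observed difference in sample proportions."""
    se = math.sqrt(P1 * (1 - P1) / n1 + P2 * (1 - P2) / n2)
    return (p_diff - (P1 - P2)) / se

z = z_diff_proportions(0.06, 0.10, 250, 0.10, 200)
upper_tail = 1 - normal_cdf(z)
print(round(z, 2))           # 2.11
print(round(upper_tail, 3))  # 0.018
```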
Biostatistics
M.I (X1)    a      b      a+b
Not (X2)    c      d      c+d
Totals      a+c    b+d    a+b+c+d
• Hypotheses and conclusions are stated in terms of association or lack of association between the two variables (H0: no association & HA: there is an association).
• Critical value = tabulated χ² = $\chi^2_{d.f,\,1-\alpha}$ (from the χ² table)
• N.B: the χ² test uses a single tail of the χ² distribution curve, so α is not divided by 2.
Smoking No Total
M.I
Non-M.I
totals
             M      F      Total
Leukemic     30     20     50
Healthy      24     26     50
Totals       54     46     100
• Expected value for each cell = (row total × column total) / grand total
• So for cell a (observed 30), E = 50 × 54/100 = 27, and likewise 23 for cell b, 27 for cell c & 23 for cell d.
• Data: two randomly selected samples, the 1st of 50 leukemic children consisting of 30 M & 20 F, and the 2nd of 50 healthy children consisting of 24 M & 26 F.
• Assumption: the two samples represent 2 independent groups taken from 2 independent populations.
• Hypotheses:
• H0: no significant difference in M & F frequencies with and without leukemia, i.e. there is no association between leukemia and gender.
• HA: there is significant…
• Level of significance: α = 0.05 → 5% chance factor effect, 95% influencing factor effect.
• d.f. = (r-1)(c-1)= (2-1)(2-1)= 1*1 = 1
• Critical point = tabulated χ² = $\chi^2_{d.f,\,1-\alpha}$ = $\chi^2_{1,\,0.95}$ = 3.841
• Testing for significance:
• Calculated χ² = ∑ [(O - E)²/E]
• = (30-27)²/27 + (20-23)²/23 + (24-27)²/27 + (26-23)²/23 = 1.449
• As calculated χ² < tabulated χ² → p > 0.05, so we fail to reject H0: there is no significant association between leukemia and gender.
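The expected values and the calculated χ² for the leukemia example can be reproduced with the sketch below. The function name `chi_square_statistic` is hypothetical, and the critical value 3.841 is the tabulated value quoted above rather than something the code computes.

```python
def chi_square_statistic(observed):
    """Return (chi-square, expected table) for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    expected = []
    for i, row in enumerate(observed):
        expected_row = []
        for j, o in enumerate(row):
            # Expected = (row total * column total) / grand total
            e = row_totals[i] * col_totals[j] / grand_total
            expected_row.append(e)
            chi2 += (o - e) ** 2 / e
        expected.append(expected_row)
    return chi2, expected

# Rows: leukemic, healthy; columns: M, F.
observed = [[30, 20], [24, 26]]
chi2, expected = chi_square_statistic(observed)
print(expected)          # [[27.0, 23.0], [27.0, 23.0]]
print(round(chi2, 3))    # 1.449
print(chi2 > 3.841)      # False
```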
Predict the reduction in heart rate obtained by giving drug with a dose of 3 mg?
Q2/ The following data represent a sample of 7 patients with pneumonia with respect to the duration of illness (in days) and body temperature (in °C).
Body temperature (°C)         38.1  38.7  38.6  39.1  38.9  40.1  40.0
Duration of illness (days)    1     2     3     4     5     6     7
Sample mean: m = (∑ x) / n,
where n = number of values in the sample.
MEDIAN: the value that divides the set of data into two equal parts (i.e. the number of values above the median equals the number of values below it). To find the position of the median, we must first arrange the values in an ordered array, then:
Position of median = (n + 1) / 2 if the number of observations is odd.
Position of median = n / 2 & (n / 2) + 1, i.e. 2 positions, if the number of observations is even. The median here is the average of the readings that lie in these two positions.
e.g. give the median of these values:
1st set of data: 5, 15, -7, 20, 25, 3, -1, 0 & -3.
2nd set of data: 7, 9, 16, -5, -9, 3, -4, 6.
Solution: ordered array:
-7, -3, -1, 0, 3, 5, 15, 20, 25 (1st set)
-9, -5, -4, 3, 6, 7, 9, 16 (2nd set)
1st set: median position = (9+1)/2 = 5, i.e. the 5th reading, so median = 3.
2nd set: median positions = 8/2 = 4 & (8/2) + 1 = 5 → readings 3 & 6, so we take their average to have one median for the set: (3+6)/2 = 4.5.
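The position rules for the median can be sketched as below. The function name `median_by_position` is hypothetical. Positions in the rules are 1-based while Python lists are 0-based, hence the subtraction of 1.

```python
def median_by_position(values):
    """Median using the (n+1)/2 rule for odd n and n/2, n/2+1 for even n."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        position = (n + 1) // 2          # single middle position
        return ordered[position - 1]
    lower = ordered[n // 2 - 1]          # reading at position n/2
    upper = ordered[n // 2]              # reading at position n/2 + 1
    return (lower + upper) / 2

print(median_by_position([5, 15, -7, 20, 25, 3, -1, 0, -3]))  # 3
print(median_by_position([7, 9, 16, -5, -9, 3, -4, 6]))       # 4.5
```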
Advantages of the median: simple to calculate & to understand, unique and, most importantly, not affected by extreme values.
CV = (SD / m) *100
SE = SD / √ n
SE measures the dispersion of sample means around the population mean (the sampling error), i.e. how well the sample represents the population, so it gives an idea of how far the sample mean is likely to be from the population mean.
From the rule above, the sampling error (standard error of the sample mean) is inversely related to the sample size: it decreases in proportion to 1/√n as n increases.
Properties of Variance, SD & SE:
1. Are based on all observations.
2. Deviations are taken from the mean.
3. Most widely used measures of
variability.
4. For SD & SE, the unit is the same as for
the mean.
Properties of CV:
1. Used for the comparison of relative variability of 2
distributions.
2. It measures level of variability in the data relative to
the average value.
3. It is independent of any unit of measurement so it is
useful for comparison of variability in 2 distributions
having variables expressed in different units.
4. Takes into account each value of the distribution.
5. Expresses the SD as a percentage of the mean.
Example: a sample of 15 patients
making visits to a health center
had traveled these distances
(miles) : calculate the measures of
central tendency and dispersion?
Patient    Distance (x)    x²
1 5 25
2 9 81
3 11 121
4 3 9
5 12 144
6 13 169
7 12 144
8 6 36
9 13 169
10 7 49
11 3 9
12 15 225
13 12 144
14 15 225
15 5 25
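A minimal sketch of the requested measures for the 15 distances, using only the Python standard library. The shortcut variance formula (∑x² - (∑x)²/n)/(n - 1) is an assumption suggested by the x² column; the slides do not state it in this section.

```python
import math
import statistics

distances = [5, 9, 11, 3, 12, 13, 12, 6, 13, 7, 3, 15, 12, 15, 5]
n = len(distances)
sum_x = sum(distances)                    # 141
sum_x2 = sum(x * x for x in distances)    # 1575

# Central tendency
mean = sum_x / n                          # 9.4
median = statistics.median(distances)     # 11
mode = statistics.mode(distances)         # 12

# Dispersion
variance = (sum_x2 - sum_x ** 2 / n) / (n - 1)
sd = math.sqrt(variance)
se = sd / math.sqrt(n)                    # SE = SD / sqrt(n)
cv = sd / mean * 100                      # CV = (SD / m) * 100
value_range = max(distances) - min(distances)

print(mean, median, mode, value_range)
print(round(variance, 2), round(sd, 2), round(se, 2), round(cv, 1))
```

The last line prints 17.83, 4.22, 1.09 and 44.9, i.e. the SD is about 45% of the mean distance.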
H0 = 0 & HA ≠ 0