
Important statistical terms

Population:
The set of all measurements of interest
to the researcher, i.e. the collection of all
responses, measurements, or counts that are
of interest

Sample:
A subset of the population
Why sampling?

Get information about large populations

• Lower cost
• Less field time
• More accuracy, i.e. a better job of data collection can be done
• Useful when it is impossible to study the whole population
Target Population:
The population to be studied, i.e. the one to which the
investigator wants to generalize the results
Sampling Unit:
The smallest unit from which a sample can be selected
Sampling frame:
List of all the sampling units from which the sample is
drawn
Sampling scheme:
Method of selecting sampling units from the sampling
frame
Types of sampling

• Non-probability samples

• Probability samples
Non-probability samples

• Convenience samples (ease of access)
  The sample is selected from elements of a population
  that are easily accessible
• Snowball sampling (friend of a friend, etc.)
• Purposive sampling (judgemental)
  You choose who you think should be in the study
• Quota sample
Non-probability samples

Probability of being chosen is unknown

Cheaper, but results cannot be generalised and there is
potential for bias
Probability samples

• Random sampling
• Each subject has a known probability of being selected
• Allows application of statistical sampling theory to results in order to:
  • Generalise
  • Test hypotheses
Conclusions

• Probability samples are the best
• They ensure
  • Representativeness
  • Precision
Methods used in probability samples

• Simple random sampling
• Systematic sampling
• Stratified sampling
• Multi-stage sampling
• Cluster sampling
Simple random sampling
Table of random numbers

684257954125632140
582032154785962024
362333254789120325
985263017424503686
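
In practice a pseudo-random number generator usually plays the role of the printed table of random numbers. The following is a minimal sketch only: the sampling frame of 500 numbered units, the sample size of 20, and the function name `simple_random_sample` are all hypothetical and chosen for illustration.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n units from the sampling frame without replacement.

    Every unit in the frame has the same chance of being selected.
    """
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(frame, n)

# Hypothetical sampling frame: units numbered 1..500
frame = list(range(1, 501))
sample = simple_random_sample(frame, n=20, seed=42)
print(sorted(sample))
```

Sampling without replacement means that no unit can appear twice in the sample, which matches reading distinct unit numbers off a random number table.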
Systematic sampling

Sampling fraction:
Ratio between sample size and population size
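
A common way to carry out systematic sampling is to take every k-th unit after a random start, where k is the inverse of the sampling fraction. The sketch below assumes that convention, which the slides do not spell out; the frame of 100 units and the function name `systematic_sample` are hypothetical.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Select every k-th unit after a random start, with k = N // n."""
    N = len(frame)
    if not 0 < n <= N:
        raise ValueError("sample size must be between 1 and the frame size")
    k = N // n  # sampling interval, the inverse of the sampling fraction n/N
    start = random.Random(seed).randrange(k)  # random start within the first interval
    return [frame[start + i * k] for i in range(n)]

# Hypothetical frame of 100 units, sampling fraction 10/100
frame = list(range(1, 101))
sample = systematic_sample(frame, n=10, seed=1)
print(sample)
```

Only the starting point is random; once it is fixed, the rest of the sample is fully determined by the interval.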
Cluster sampling
Cluster: a group of sampling units close to each
other i.e. crowding together in the same area or
neighborhood
Cluster sampling
[Figure: area divided into Section 1 to Section 5]
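
Cluster sampling can be sketched as drawing whole clusters at random and then studying the units inside them. The example assumes a one-stage design in which every unit of a selected cluster is included, which the slides do not state explicitly; the five sections and their household IDs are invented for illustration.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick whole clusters and keep every unit inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # select cluster names
    units = [unit for name in chosen for unit in clusters[name]]
    return chosen, units

# Hypothetical frame: five sections, each a cluster of household IDs
clusters = {
    "Section 1": [101, 102, 103],
    "Section 2": [201, 202],
    "Section 3": [301, 302, 303, 304],
    "Section 4": [401, 402],
    "Section 5": [501, 502, 503],
}
chosen, units = cluster_sample(clusters, n_clusters=2, seed=7)
print(chosen, units)
```

The random selection operates on clusters, not on individual units, so only a list of clusters is needed as the sampling frame.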
• Stratified sampling
• Multi-stage sampling
Errors in sample

Systematic error (or bias)
• Inaccurate response (information bias)
• Selection bias

Sampling error (random error)

Type 1 error
• The probability of finding a difference between our
  sample and the population when there really isn't one
• Known as α (the probability of a type 1 error)
• Usually set at 5% (or 0.05)


Type 2 error

• The probability of not finding a difference that
  actually exists between our sample and the population
• Known as β (the probability of a type 2 error)
• Power is (1 - β) and is usually set at 80%
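
The meaning of α can be illustrated with a small simulation: when samples are drawn from a population in which there is truly no difference, roughly 5% of them still appear "significant" at α = 0.05. This is only a sketch under stated assumptions: a two-sided z-test with known standard deviation, and invented population values (mean 100, SD 15, samples of 30). None of these come from the slides.

```python
import random
from statistics import NormalDist, mean

def type1_error_rate(trials=2000, n=30, mu=100.0, sigma=15.0, alpha=0.05, seed=0):
    """Estimate how often a true null hypothesis is rejected by chance."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    false_positives = 0
    for _ in range(trials):
        # The sample really does come from the population with mean mu
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        z = (mean(sample) - mu) / (sigma / n ** 0.5)
        if abs(z) > z_crit:
            false_positives += 1  # a difference was "found" that does not exist
    return false_positives / trials

rate = type1_error_rate()
print(rate)  # expected to land close to alpha
```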


Sample size

Quantitative:

$$n = \frac{Z^2 \sigma^2}{D^2}$$

$$n = \frac{(\sigma_1^2 + \sigma_2^2) \times F}{D^2}$$

Qualitative:

$$n = \frac{Z^2 \pi (1 - \pi)}{D^2}$$

$$n = \frac{2 P (1 - P) F}{D^2}$$
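
The four formulas can be written as small helper functions. This is a sketch, not a definitive implementation: the function names are hypothetical, and the reading of the symbols is an assumption based on how the worked problems use them (Z is the standard normal value for the confidence level, σ a standard deviation, π or P a proportion, D the accepted error or difference, F a factor that depends on confidence level and power). Results are rounded up to the next whole subject.

```python
import math

def n_single_mean(z, sigma, d):
    """Sample size to estimate one mean: Z^2 * sigma^2 / D^2."""
    return math.ceil(z ** 2 * sigma ** 2 / d ** 2)

def n_two_means(sigma1, sigma2, f, d):
    """Sample size per group to compare two means: (s1^2 + s2^2) * F / D^2."""
    return math.ceil((sigma1 ** 2 + sigma2 ** 2) * f / d ** 2)

def n_single_proportion(z, p, d):
    """Sample size to estimate one proportion: Z^2 * p * (1 - p) / D^2."""
    return math.ceil(z ** 2 * p * (1 - p) / d ** 2)

def n_two_proportions(p_mean, f, d):
    """Sample size per group to compare two proportions: 2 * P * (1 - P) * F / D^2."""
    return math.ceil(2 * p_mean * (1 - p_mean) * f / d ** 2)

# Example with Z = 2.58 (99% confidence), SD = 46, accepted error = 4
print(n_single_mean(2.58, 46, 4))  # 881
```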
Problem 1
A study is to be performed to determine a
certain parameter in a community. From a
previous study a standard deviation (SD) of 46
was obtained. If a sampling error of up to 4 is
acceptable, how many subjects should be
included in this study at the 99% level of
confidence?
Answer

$$n = \frac{Z^2 \sigma^2}{D^2}$$

$$n = \frac{2.58^2 \times 46^2}{4^2} = 880.3 \approx 881$$
Problem 2
• A study is to be done to determine the effect
  of 2 drugs (A and B) on blood glucose
  level (BGL). From previous studies using those
  drugs, SDs of BGL of 8 and 12 g/dl were
  obtained respectively.
• A confidence level of 95% and a power of
  90% are required to detect a mean
  difference between the two groups of 3
  g/dl. How many subjects should be included
  in each group?
Answer

$$n = \frac{(\sigma_1^2 + \sigma_2^2) \times F}{D^2}$$

$$n = \frac{(8^2 + 12^2) \times 10.5}{3^2} = 242.6 \approx 243$$
in each group
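
The slides use F = 10.5 without saying where it comes from. One common convention, assumed here rather than taken from the source, is F = (Z_{α/2} + Z_β)², which gives about 10.5 for 95% confidence with 90% power and about 7.8 for 95% confidence with 80% power. The function name `power_factor` is hypothetical.

```python
from statistics import NormalDist

def power_factor(confidence=0.95, power=0.90):
    """Return (Z_alpha/2 + Z_beta)^2 for a two-sided test (assumed definition of F)."""
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # e.g. 1.96 for 95%
    z_beta = NormalDist().inv_cdf(power)                      # e.g. 1.28 for 90% power
    return (z_alpha + z_beta) ** 2

print(round(power_factor(0.95, 0.90), 1))  # 10.5
print(round(power_factor(0.95, 0.80), 1))  # 7.8
```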
Problem 3
It was desired to estimate the proportion of
anaemic children in a certain preparatory
school. In a similar study at another school
a proportion of 30% was detected.
Compute the minimal sample size required
at a 95% confidence level, accepting
a difference of up to 4% from the true
population proportion.
Answer

$$n = \frac{Z^2 \pi (1 - \pi)}{D^2}$$

$$n = \frac{1.96^2 \times 0.3 \times (1 - 0.3)}{(0.04)^2} = 504.21 \approx 505$$
Problem 4
In previous studies, the percentage of
hypertensives among diabetics was 70%
and among non-diabetics was 40% in a
certain community.
A researcher wants to perform a
comparative study of hypertension among
diabetics and non-diabetics at a
95% confidence level and 80% power.
What is the minimal sample to be taken
from each group, accepting a difference
of up to 4% from the true value?
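Answer

$$n = \frac{2 P (1 - P) F}{D^2}$$

With P = (0.70 + 0.40) / 2 = 0.55:

$$n = \frac{2 \times 0.55 \times (1 - 0.55) \times 7.8}{0.04^2} = 2413.1 \approx 2414$$
in each group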
Precision
Cost
