Types of data

Nominal = data labelled according to category – with no order

Ordinal = data labelled according to category with an intrinsic order, but
without equal differences between consecutive levels, e.g. ASA grade or pain
(mild, moderate, severe)

Parametric = data measured on a scale with an intrinsic order and equal
distance between consecutive intervals. A.k.a. continuous
• Can be either
o Interval data = equal differences between numbers without a
natural zero, e.g. Celsius scale (zero does not mean zero
energy)
o Ratio data = equal differences between numbers with a natural
zero (i.e. zero means complete absence of the thing being
measured, e.g. Kelvin scale)

Estimation = using sample data to try to determine population parameters

Central tendency = single value representation of set of data

Mode = most commonly occurring value

Median = middle value in an ordered list of data (where the list contains an
even number of observations, the median is the average of the two central
observations)

Mean = average value (sum of all values divided by the number of values)

Parametric data can be represented by mean, median and mode

Non-parametric data can be represented by median and mode
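
The three measures of central tendency defined above can be sketched with Python's standard `statistics` module. The scores below are hypothetical illustrative data, not taken from these notes; they are deliberately skewed so that the mean, median and mode differ.

```python
import statistics

# Hypothetical data (an assumption for illustration): 8 observations,
# skewed by one large value.
scores = [1, 2, 2, 3, 4, 6, 7, 15]

mode = statistics.mode(scores)      # most commonly occurring value
median = statistics.median(scores)  # even n: average of the two central values
mean = statistics.mean(scores)      # sum of values / number of values

print("mode:", mode)
print("median:", median)
print("mean:", mean)
```

With an even number of observations the median is the average of the two central values (here 3 and 4), and the single large value pulls the mean above the median.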

Measure of Dispersion

Spread of data = distribution

Range = difference between largest and smallest values – limited use

Quartiles = expresses distribution in quarters

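Inter-Quartile Range = difference between 1st and 3rd quartiles (ignores the
1st and last quarters of the data)

Variance = measures spread using all data = calculate the difference between
each value and the mean, square each difference, add them up, then divide by
the number of values minus one (n - 1, for a sample).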

Standard Deviation = the square root of variance, which converts variance into
appropriate units (the same units as the original data)
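
A minimal sketch of the measures of dispersion above, using hypothetical data. It assumes the sample variance (dividing by n - 1), and the quartiles come from the default method of `statistics.quantiles`; other quartile conventions give slightly different values.

```python
import statistics

# Hypothetical observations (an assumption for illustration)
values = [4, 8, 6, 5, 3, 7, 9, 6]

# Range: largest minus smallest value
data_range = max(values) - min(values)

# Quartiles and inter-quartile range (default 'exclusive' method)
q1, q2, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1

# Variance: squared differences from the mean, summed, divided by n - 1
mean = statistics.mean(values)
squared_diffs = [(x - mean) ** 2 for x in values]
variance = sum(squared_diffs) / (len(values) - 1)

# Standard deviation: square root of variance, back in the original units
sd = variance ** 0.5

print(data_range, iqr, variance, sd)
```

The hand-calculated variance agrees with `statistics.variance(values)`, which uses the same n - 1 divisor.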

Normal Distribution = bell-shaped curve, symmetrical about central axis
(which corresponds to mean, mode and median). Standard normal curve has
a mean of 0 and a SD of 1. Area under the curve = 1.

68% of values lie within +/- 1 SD of the mean

95% of values lie within +/- 2 SD of the mean (more precisely, 1.96 SD)

99.7% of values lie within +/- 3 SD of the mean
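
The proportions quoted above can be checked against the standard normal curve. This sketch uses `statistics.NormalDist` from the Python standard library; the helper name `fraction_within` is an assumption introduced for illustration.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, SD 1
standard_normal = NormalDist(mu=0, sigma=1)

def fraction_within(k):
    """Area under the curve between -k SD and +k SD of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within +/- {k} SD: {fraction_within(k) * 100:.2f}%")
```

The exact areas are roughly 68.3%, 95.4% and 99.7%, which is why the figures are usually rounded to 68%, 95% and 99.7%.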

Standard Error of Mean (SEm) = quantifies uncertainty in estimate of mean

SEm = SD / √n, where n = sample size
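
A short sketch of the SEm formula above. The function name `standard_error` and the numbers are assumptions chosen for illustration; the point is that the uncertainty in the mean shrinks with the square root of the sample size.

```python
import math

def standard_error(sd, n):
    """Standard error of the mean = SD / square root of sample size."""
    return sd / math.sqrt(n)

# Hypothetical example: same SD, increasing sample size
sd = 2.0
for n in (8, 32, 128):
    print(n, round(standard_error(sd, n), 3))
```

Quadrupling the sample size halves the standard error, so large gains in precision need disproportionately larger samples.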

Skew = values are clustered on one side and sparse on the other.

Null hypothesis = no change is seen – i.e. observations are the same

Alternate hypothesis = change is seen

p-value
• = probability of obtaining a result at least as extreme as the one
observed, by chance alone, if the null hypothesis is true

The lower the p-value, the lower the chance that the observation occurred by
chance (i.e. the null hypothesis is unlikely)

p-value of 0.05 = 5% chance

p-value >0.05 = null hypothesis is not accepted as true, but merely not
rejected

p-value <0.05 = significant, i.e. null hypothesis is rejected

p-value <0.01 = highly significant

Error

Type 1 error = false positive = seeing a difference where there isn’t one

Type 2 error = false negative = not seeing a difference where there is one

Mnemonic: Type 1 = Now you see it; Type 2 = Now you don’t

Experimental design aims to minimise error

Error occurs due to


• Random error – due to intrinsic variation in samples (reduced by
increasing sample size)
• Systematic error – a.k.a. bias (not reduced by increasing sample size)

Bias = systematic error resulting in incorrect estimation of statistical
parameters
• Selection bias = groups aren’t comparable (reduced by randomisation)
• Measurement bias = error occurring in measuring variables (e.g.
equipment error or observer bias – reduced by blinding and
standardising equipment)

Confounding = association between study factors is distorted due to other
variables

Reduce error by
• Randomization (reduces selection bias) = equal chance of being in
either group
• Blinding (reduces measurement bias)
o Single blinded = subject doesn’t know what group they are in
o Double blinded = neither subject nor observer knows what group
the subject is in
• Adequate sample size reduces error (ideal sample size can be
calculated by power analysis)

Power of a study = probability of appropriately rejecting the null hypothesis if
it is false (i.e. ability to detect a significant difference if one exists).
Power = 1 - beta.

Sample size depends on:
• Effect size = difference in effect between treatment and control group
(the larger the effect size, the smaller the sample size needed)
• Beta value = probability of a type 2 error, conventionally 20% (i.e. a
power of 80% is needed)
• Alpha value = probability of a type 1 error = the p-value threshold for
significance, conventionally 0.05
• Distribution of values = parametric or non-parametric
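
As a rough illustration of how these factors interact, the sketch below uses the common normal-approximation formula for comparing the means of two equal-sized groups with a two-tailed test. This formula, the function name `sample_size_per_group`, and the use of a standardised effect size (difference in means divided by SD) are assumptions not stated in these notes; a formal power analysis may give slightly different numbers.

```python
import math
from statistics import NormalDist

def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group for a two-tailed comparison of two means.

    effect_size is standardised: (difference in means) / SD.
    n = 2 * (z_alpha + z_beta)^2 / effect_size^2, rounded up.
    """
    z = NormalDist()                    # standard normal curve
    z_alpha = z.inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = z.inv_cdf(power)           # about 0.84 for power = 80%
    return math.ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)

for d in (0.25, 0.5, 1.0):
    print(d, sample_size_per_group(d))
```

Halving the effect size roughly quadruples the sample size needed, which matches the rule above that larger effects need smaller samples.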

Assessing distribution of data

= testing whether data are normally distributed (i.e. suitable for parametric
tests), using either the Kolmogorov-Smirnov test or Q-Q (quantile-quantile)
plots

Assessing significance of data

= calculating p-values

Requires an appropriate test for the type of data being examined

Parametric = applicable to data that is normally distributed


• Student’s t-test
o Assesses null hypothesis that mean obtained is same as known
population mean
o t = (sample mean - known mean) / SE of sample mean
o When the means are the same, t = 0
o As the sample mean deviates from the population mean, the magnitude of t
increases and the p-value decreases, i.e. it becomes less plausible that
the sample came from the known population
• Student’s paired t-test
o Examines paired data (i.e. data from same subject, before and
after)
o Interested in differences between individuals, NOT populations
o T = (mean difference before and after) / SE of difference
• ANOVA tests
o = analysis of variance; compares the means of more than 2 groups
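
The two t formulas above can be sketched as follows. The function names and all data are hypothetical, chosen for illustration. Only the t statistic is computed; converting t to a p-value needs the t distribution with n - 1 degrees of freedom (from tables or a statistics package), which is not shown here.

```python
import math
import statistics

def one_sample_t(sample, known_mean):
    """t = (sample mean - known mean) / SE of sample mean."""
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return (statistics.mean(sample) - known_mean) / se

def paired_t(before, after):
    """t = (mean difference before and after) / SE of the differences."""
    diffs = [a - b for b, a in zip(before, after)]
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / se

# Hypothetical sample with mean 6
sample = [4, 8, 6, 5, 3, 7, 9, 6]
print(one_sample_t(sample, known_mean=6))  # means equal, so t is 0
print(one_sample_t(sample, known_mean=5))

# Hypothetical paired measurements on the same 4 subjects
before = [120, 130, 125, 140]
after = [115, 128, 118, 135]
print(paired_t(before, after))
```

The paired test works on the within-subject differences, so variation between subjects does not inflate the standard error.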

One tailed tests
• look to see if there is a difference in one specified direction only
(either above or below the null value)

Two tailed tests
• look to see if there is a difference in either direction (above or below
the null value)
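
To illustrate the difference in practice, the sketch below converts a test statistic into one-tailed and two-tailed p-values. It assumes a statistic that follows the standard normal distribution (a z statistic), which is a simplification introduced here for illustration; the function name `p_values` is likewise hypothetical.

```python
from statistics import NormalDist

def p_values(z):
    """Return (one-tailed, two-tailed) p-values for a z statistic."""
    # Area in a single tail beyond |z|
    one_tailed = 1 - NormalDist().cdf(abs(z))
    # A two-tailed test counts both tails
    return one_tailed, 2 * one_tailed

one_tailed, two_tailed = p_values(1.96)
print(round(one_tailed, 3), round(two_tailed, 3))
```

For the same data the two-tailed p-value is twice the one-tailed value, so a one-tailed test reaches significance more easily and the direction has to be specified before the data are seen.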

Non-normal distribution

Nominal = Chi-squared
• compares observed values (seen in sample) and expected values
(calculated by extrapolating known data from a population to the
study population)
• Calculated by doing the following
o For each observed number subtract the corresponding expected
number (O - E)
o Then square that: (O - E)^2
o Then divide that by the corresponding expected number:
(O - E)^2 / E
o Repeat this for every cell
o Add all the individual values of (O - E)^2 / E together = this is the
chi-square statistic for the table
• In order to analyse the result you will need
o A pre-determined level of significance – usually 0.05
o The degrees of freedom (df) for the data (= number in the
sample minus the number of restrictions)
▪ E.g. if you have 4 numbers with the restriction that they must
add up to 50, then the first 3 numbers can be anything,
e.g. 5, 10 and 15. The fourth number must then be
20 (in order to make 50)
▪ Therefore the degrees of freedom = (4 - 1) = 3
• Having calculated these, the Chi-squared value is applied to a Chi-
squared distribution table.
• If your calculated Chi-squared corresponds to a p-value of 0.05 or less
then the null hypothesis can be rejected.
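
The calculation steps above can be sketched for a contingency table. Two assumptions are made here that the notes do not spell out: expected values are derived from the row and column totals of the table (the usual approach when no external population figures are available), and the degrees of freedom are taken as (rows - 1) x (columns - 1). The function name and the counts are hypothetical.

```python
def chi_squared(observed):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            # Expected count if rows and columns were unrelated (assumption)
            e = row_totals[i] * col_totals[j] / grand_total
            statistic += (o - e) ** 2 / e  # (O - E)^2 / E for this cell

    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, df

# Hypothetical 2 x 2 table: rows = treatment / control, columns = event / no event
observed = [[10, 20],
            [20, 10]]
statistic, df = chi_squared(observed)

# 3.841 is the tabulated chi-squared critical value for df = 1 at p = 0.05
print(statistic, df, statistic > 3.841)
```

Here every expected count is 15, so each cell contributes 25 / 15 and the statistic exceeds the critical value for 1 degree of freedom, meaning the null hypothesis would be rejected at the 0.05 level.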

Ordinal = Wilcoxon signed-rank test, Mann-Whitney U test

How to perform THE MANN-WHITNEY U TEST

1. Call one sample A and the other B.

Sample A = 7; 3; 6; 2; 4; 3; 5; 5

Sample B = 3; 5; 6; 4; 6; 5; 7; 5

2. Combine the samples into one group, and rank in ascending order

A A A B A B A A B B B A B B A B

2 3 3 3 4 4 5 5 5 5 5 6 6 6 7 7

3. Look at each B in turn, count the number of A’s preceding each one.
Add up the total to get a U value

U= 3+4+6+6+6+7+7+8 = 47

4. Look at each A in turn, count the number of B’s preceding each one.
Add up the total to get a U value

U= 0+0+0+1+2+2+5+7 = 17

5. Use the smaller of the two U values (here, 17). Compare this to the
probability table, against the total sample number. The table value
gives the probability value: the percentage probability that the
difference between the two sets of data could have occurred by
chance
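
The counting procedure in steps 1 to 5 can be reproduced in a few lines. This sketch follows the worked example exactly, including its convention that tied values are ranked with A before B; many references instead count each tie as one half, which would generally give different U values. The function name `u_values` is an assumption for illustration.

```python
def u_values(sample_a, sample_b):
    """Return (U from counting A's before each B, U from counting B's before each A)."""
    # Step 2: combine and rank in ascending order.
    # Sorting (value, label) tuples places A before B when values tie.
    combined = sorted([(v, "A") for v in sample_a] + [(v, "B") for v in sample_b])

    u_from_b = 0  # step 3: total number of A's preceding each B
    u_from_a = 0  # step 4: total number of B's preceding each A
    seen_a = seen_b = 0
    for _, label in combined:
        if label == "A":
            u_from_a += seen_b
            seen_a += 1
        else:
            u_from_b += seen_a
            seen_b += 1
    return u_from_b, u_from_a

# Samples from the worked example
sample_a = [7, 3, 6, 2, 4, 3, 5, 5]
sample_b = [3, 5, 6, 4, 6, 5, 7, 5]

u_from_b, u_from_a = u_values(sample_a, sample_b)
print(u_from_b, u_from_a)         # the two U values
print(min(u_from_b, u_from_a))    # step 5: use the smaller one
```

As a check, the two U values always add up to the product of the two sample sizes (here 8 x 8 = 64).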

Type of Data and Which test to use


Data type    2 groups,            Same subjects,        > 2 groups,          Serial
             different subjects   before and after      different subjects   measurements
                                  intervention

Continuous   Unpaired t-test      Paired t-test         ANOVA                Repeated measures ANOVA

Ordinal      Mann-Whitney         Wilcoxon signed-rank  Kruskal-Wallis       Friedman

Nominal      Chi-squared          McNemar               Chi-squared          Cochran’s test

General Definitions

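Sensitivity
• Probability of a +ve test result in a person who has the disease
• = true positives / (true positives + false negatives)

Specificity
• Probability of a -ve test result in a person who does not have the disease
• = true negatives / (true negatives + false positives)

Positive predictive value
• Probability a person has a disease when given a +ve test result
• = true positives / (true positives + false positives)

Negative predictive value
• Probability a person does not have a disease when given a -ve test
result
• = true negatives / (true negatives + false negatives)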
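
These four definitions correspond to the standard 2 x 2 table of test result against disease status. The sketch below uses hypothetical counts and a hypothetical function name, chosen only for illustration.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Compute test performance from counts of true/false positives/negatives."""
    return {
        "sensitivity": tp / (tp + fn),  # +ve test among those with disease
        "specificity": tn / (tn + fp),  # -ve test among those without disease
        "ppv": tp / (tp + fp),          # disease present among +ve tests
        "npv": tn / (tn + fn),          # disease absent among -ve tests
    }

# Hypothetical screening study: 100 people with disease, 900 without
metrics = diagnostic_metrics(tp=90, fp=50, fn=10, tn=850)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity are properties of the test itself, whereas the predictive values also depend on how common the disease is in the group tested.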

Risk
• Ratio of the number of events occurring in a group to the total number of
subjects in that group

Relative Risk
• Ratio of risk in treatment group to risk in control group
• = risk in treatment / risk in control

Absolute risk reduction


• Difference in event rates between treatment and control groups
• = risk in control group – risk in treatment group

Relative risk reduction


• % reduction in events in treatment group compared with control group
• = 1 – relative risk

Odds
• ratio of probability of an event occurring to probability of it not occurring

Odds ratio
• ratio of the odds of an event occurring in one group to the odds of it
occurring in another group

Number needed to treat (NNT)


• number of patients needed to be treated to prevent one adverse
outcome
• ideally needs to be as low as possible
• = 1 / absolute risk reduction
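
The risk measures defined above, from risk through to NNT, can be worked through together. The trial figures and the function name below are hypothetical, and risk is taken as events divided by the number of subjects in the group.

```python
def risk_measures(events_treatment, n_treatment, events_control, n_control):
    """Compute risk, odds and derived measures for a two-group trial."""
    risk_t = events_treatment / n_treatment  # risk in treatment group
    risk_c = events_control / n_control      # risk in control group

    relative_risk = risk_t / risk_c
    arr = risk_c - risk_t                    # absolute risk reduction
    rrr = 1 - relative_risk                  # relative risk reduction

    odds_t = risk_t / (1 - risk_t)           # P(event) / P(no event)
    odds_c = risk_c / (1 - risk_c)

    return {
        "relative_risk": relative_risk,
        "arr": arr,
        "rrr": rrr,
        "odds_ratio": odds_t / odds_c,
        "nnt": 1 / arr,                      # number needed to treat
    }

# Hypothetical trial: 10/100 events on treatment, 20/100 on control
results = risk_measures(10, 100, 20, 100)
for name, value in results.items():
    print(f"{name}: {value:.3f}")
```

In this example the treatment halves the risk (relative risk 0.5) and 10 patients need to be treated to prevent one event. The odds ratio (about 0.44) differs from the relative risk because the event is not rare.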

Number needed to harm (NNH)


• number of patients needed to be treated to cause one adverse event
• Low NNH = low therapeutic index