
Evidence-Based Medicine:

Year 1
Matt Kean
Manchester Medical School

matthew.kean@manchester.ac.uk

Session 5: Introduction to Medical Statistics
In this session:
Main learning outcome: Use a statistical computer program
to conduct basic statistical analysis and interpret the
output of the analysis.

Displaying data
Summarising data
The normal distribution

Categorical Variables

Count the observations in each category

Counts are called frequencies, displayed in a frequency distribution (table).

A relative frequency is a proportion or percentage of the total number in the sample.
Method of delivery of 600 babies born in a hospital

Method of delivery    No. of births    Percentage
Normal                          478          79.7
Forceps                          65          10.8
Caesarean section                57           9.5
Total                           600         100.0
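
The percentages in the table can be reproduced from the counts. The following is a minimal sketch in Python (the session itself uses StatsDirect, so the language and the variable names are illustrative assumptions, not part of the original material).

```python
# Counts taken from the table of 600 births.
births = {"Normal": 478, "Forceps": 65, "Caesarean section": 57}

total = sum(births.values())

# Relative frequency = category count / total, expressed as a percentage.
percentages = {method: round(100 * count / total, 1)
               for method, count in births.items()}

for method, count in births.items():
    print(f"{method:<18} {count:>4} {percentages[method]:>6.1f}")
print(f"{'Total':<18} {total:>4} {100.0:>6.1f}")
```

Dividing each count by the total, rather than typing the percentages in by hand, keeps the table consistent if a count is corrected later.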

Categorical Variables
often shown as bar graphs or pie charts

[Figures: bar graph of No. of births (axis 0 to 600) by method of delivery (Normal, Forceps, Caesarean section); pie chart of the same data: Normal delivery (478), Forceps (65), Caesarean section (57)]

Numerical Variables
To form a frequency distribution, the data may be
grouped.
Haemoglobin levels (g/100ml) for 70 women: raw data (lowest value 8.8, highest value 15.1)

10.2  13.7  10.4  14.9  11.5  12.0  11.0  13.3  12.9  10.6
10.5  12.1   9.4  13.2  10.8  11.7  13.7  11.8  14.1  10.3
13.6  12.1   9.3  12.9  11.4  12.7  10.6  11.4  11.9  13.5
14.6  11.2  11.7  10.9  10.4  12.0  12.9  11.1   8.8  10.2
11.6  12.5  13.4  12.1  10.9  11.3  14.7  10.8  13.3  11.9
11.4  12.5  13.0  11.6  13.1   9.7  11.2  15.1  10.7  12.9
13.4  12.3  11.0  14.6  11.1  13.5  10.9  13.1  11.8  12.2

Haemoglobin levels (g/100ml) for 70 women: frequency distribution

Haemoglobin (g/100ml)    No. of women    Percentage
8.0-8.9                             1           1.4
9.0-9.9                             3           4.3
10.0-10.9                          14          20.0
11.0-11.9                          19          27.1
12.0-12.9                          14          20.0
13.0-13.9                          13          18.6
14.0-14.9                           5           7.1
15.0-15.9                           1           1.4
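
Grouping the raw values into classes can be sketched as follows. This is an illustrative Python example, assuming each class is 1 g/100ml wide and that a value belongs to the class given by its whole-number part (so 8.8 falls in the class starting at 8); the variable names are not from the original slides.

```python
import math
from collections import Counter

# The 70 haemoglobin values (g/100ml) from the raw data.
haemoglobin = [
    10.2, 13.7, 10.4, 14.9, 11.5, 12.0, 11.0, 13.3, 12.9, 10.6,
    10.5, 12.1, 9.4, 13.2, 10.8, 11.7, 13.7, 11.8, 14.1, 10.3,
    13.6, 12.1, 9.3, 12.9, 11.4, 12.7, 10.6, 11.4, 11.9, 13.5,
    14.6, 11.2, 11.7, 10.9, 10.4, 12.0, 12.9, 11.1, 8.8, 10.2,
    11.6, 12.5, 13.4, 12.1, 10.9, 11.3, 14.7, 10.8, 13.3, 11.9,
    11.4, 12.5, 13.0, 11.6, 13.1, 9.7, 11.2, 15.1, 10.7, 12.9,
    13.4, 12.3, 11.0, 14.6, 11.1, 13.5, 10.9, 13.1, 11.8, 12.2,
]

n = len(haemoglobin)

# Assign each value to a class by its whole-number part.
counts = Counter(math.floor(x) for x in haemoglobin)

for lower in sorted(counts):
    pct = 100 * counts[lower] / n
    print(f"{lower}.0-{lower}.9  {counts[lower]:>3}  {pct:>5.1f}")
```

The class width is a choice: narrower classes show more detail but leave fewer observations in each class.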

Frequencies: Numerical Variables
often shown as histograms or frequency polygons

Shapes of Distributions

Normal or Gaussian: symmetrical and bell-shaped, e.g., height

Positively skewed, or skewed to the right, e.g., triceps skinfold measurement

Negatively skewed, or skewed to the left, e.g., period of gestation

Bimodal, e.g., hormone levels of males and females

Reverse J-shaped, e.g., survival time after diagnosis of lung cancer

Uniform, e.g., month of occurrence of disease with no seasonal variation

Quantiles
Equal-sized divisions of a distribution.
Examples of quantiles:

The median divides the distribution into two equal parts

Quartiles divide the distribution into four equal parts

Deciles divide the distribution into ten equal parts

Percentiles (or centiles) divide the distribution into one hundred equal parts
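
As an illustration, the cut points for different quantiles can be computed with Python's standard `statistics` module. This is a sketch only: the sample is the first ten haemoglobin values from the raw data, chosen here for brevity, and statistical packages use slightly different interpolation rules, so quartiles from another program may differ a little.

```python
import statistics

# First ten haemoglobin values (g/100ml) from the raw data; an illustrative subset.
sample = [10.2, 13.7, 10.4, 14.9, 11.5, 12.0, 11.0, 13.3, 12.9, 10.6]

# n equal parts are separated by n - 1 cut points.
median_cut = statistics.quantiles(sample, n=2)   # 1 cut point: the median
quartiles = statistics.quantiles(sample, n=4)    # 3 cut points
deciles = statistics.quantiles(sample, n=10)     # 9 cut points

print("Median:   ", median_cut)
print("Quartiles:", quartiles)
print("Deciles:  ", deciles)
```

The middle quartile is the median, which is why the median appears in the list of quantiles.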

Summarising numerical data
Numerical variables are often summarised in two measurements:

A measure of central tendency

A measure of the spread (or dispersion, or variability) of values

Measures of central tendency

1. The Mean
The average value is represented by the mean.
This is the sum of the values divided by the number of values:

x̄ = Σx / n

x - the variable
Σ - the sum of
n - number of observations
x̄ - the mean
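
The definition of the mean can be followed step by step in a short Python sketch. The sample below is the first ten haemoglobin values from the raw data, used purely as an illustration.

```python
# First ten haemoglobin values (g/100ml) from the raw data; an illustrative subset.
sample = [10.2, 13.7, 10.4, 14.9, 11.5, 12.0, 11.0, 13.3, 12.9, 10.6]

n = len(sample)        # number of observations
total = sum(sample)    # the sum of the values
mean = total / n       # the mean: sum divided by number of observations

print(f"n = {n}, sum = {total:.1f}, mean = {mean:.2f}")
```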

Measures of central tendency

2. Median: The value that divides the distribution in half.
If the data are ranked, the median is the middle observation.
Median = ((n + 1)/2)th value of the ordered observations
When n is even, this position falls halfway between the two middle observations, so the median is their average.

3. Mode: The mode is the value that occurs most often.
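
The median rule and the mode can be checked against the 70 haemoglobin values. The following Python sketch is illustrative (variable names are assumptions); it locates the (n + 1)/2 position by hand and averages the two neighbouring observations when that position is not a whole number.

```python
import math
import statistics

# The 70 haemoglobin values (g/100ml) from the raw data.
haemoglobin = [
    10.2, 13.7, 10.4, 14.9, 11.5, 12.0, 11.0, 13.3, 12.9, 10.6,
    10.5, 12.1, 9.4, 13.2, 10.8, 11.7, 13.7, 11.8, 14.1, 10.3,
    13.6, 12.1, 9.3, 12.9, 11.4, 12.7, 10.6, 11.4, 11.9, 13.5,
    14.6, 11.2, 11.7, 10.9, 10.4, 12.0, 12.9, 11.1, 8.8, 10.2,
    11.6, 12.5, 13.4, 12.1, 10.9, 11.3, 14.7, 10.8, 13.3, 11.9,
    11.4, 12.5, 13.0, 11.6, 13.1, 9.7, 11.2, 15.1, 10.7, 12.9,
    13.4, 12.3, 11.0, 14.6, 11.1, 13.5, 10.9, 13.1, 11.8, 12.2,
]

ordered = sorted(haemoglobin)
n = len(ordered)

# Position of the median among the ordered observations (1-based).
position = (n + 1) / 2

# For even n the position lies between two observations: average them.
# For odd n both indices point at the same middle observation.
lower = ordered[math.floor(position) - 1]
upper = ordered[math.ceil(position) - 1]
median = (lower + upper) / 2

# The value that occurs most often.
mode = statistics.mode(haemoglobin)

print(f"position = {position}, median = {median:.2f}, mode = {mode}")
```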

Measures of the spread of the data

Range: highest value - lowest value

Simplest measure

Gives no idea of the distribution between its values

Interquartile range: upper quartile - lower quartile

Indicates spread of middle 50% of distribution

Used with median for box and whiskers plot

[Figure: Box and whisker plot of the distribution of the haemoglobin levels of 70 women; axis: Haemoglobin level (g/100ml), 8 to 16]
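
Both measures can be computed in a few lines. This Python sketch uses the first ten haemoglobin values from the raw data as an illustrative sample; the quartiles come from the standard `statistics.quantiles` function, whose default interpolation rule may differ slightly from the one used by other statistical software.

```python
import statistics

# First ten haemoglobin values (g/100ml) from the raw data; an illustrative subset.
sample = [10.2, 13.7, 10.4, 14.9, 11.5, 12.0, 11.0, 13.3, 12.9, 10.6]

# Range: highest value minus lowest value.
value_range = max(sample) - min(sample)

# Interquartile range: upper quartile minus lower quartile.
lower_quartile, median, upper_quartile = statistics.quantiles(sample, n=4)
iqr = upper_quartile - lower_quartile

print(f"range = {value_range:.2f}")
print(f"interquartile range = {iqr:.2f}")
```

Because the interquartile range ignores the lowest and highest quarters of the data, it is less affected by extreme values than the range.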

Measures of the spread of the data


Standard deviation:

Most common measure

Uses all the data values
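
The slide does not give the formula, so the following Python sketch rests on an assumption: it computes the sample standard deviation, which divides the sum of squared deviations from the mean by n - 1. The sample is the first ten haemoglobin values from the raw data, used for illustration only.

```python
import math
import statistics

# First ten haemoglobin values (g/100ml) from the raw data; an illustrative subset.
sample = [10.2, 13.7, 10.4, 14.9, 11.5, 12.0, 11.0, 13.3, 12.9, 10.6]

n = len(sample)
mean = sum(sample) / n

# Every value contributes its squared distance from the mean.
squared_deviations = [(x - mean) ** 2 for x in sample]

# Assumption: sample variance with an n - 1 divisor.
variance = sum(squared_deviations) / (n - 1)
sd = math.sqrt(variance)

print(f"mean = {mean:.2f}, SD = {sd:.2f}")
print(f"statistics.stdev gives {statistics.stdev(sample):.2f}")
```

Taking the square root returns the measure to the original units (g/100ml), which is why the standard deviation is usually reported rather than the variance.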

Normal distribution
The normal (or Gaussian) distribution is a frequency distribution with a symmetrical, bell-shaped curve.
Important because:

Many observed variables are normally distributed

Many statistical tests are based on the normal distribution

[Figure: Bell-shaped curve. Normal density curves for Mean = 70, SD = 5 and Mean = 70, SD = 10; x-axis: Grades, 40 to 100; y-axis: Density, 0.00 to 0.08]
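
The two curves in the figure can be tabulated with Python's standard `statistics.NormalDist` class. This is an illustrative sketch: the means and standard deviations are those quoted in the figure, and the grid of grades (40 to 100 in steps of 10) is an assumption taken from the axis labels.

```python
from statistics import NormalDist

# Same mean, different spread, as in the figure.
narrow = NormalDist(mu=70, sigma=5)
wide = NormalDist(mu=70, sigma=10)

print("Grade  SD=5    SD=10")
for grade in range(40, 101, 10):
    # pdf() returns the height of the density curve at that grade.
    print(f"{grade:>5}  {narrow.pdf(grade):.4f}  {wide.pdf(grade):.4f}")
```

Both curves are centred on 70 and symmetrical about it; the smaller standard deviation gives a taller, narrower bell, with a peak density of about 0.08 compared with about 0.04.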


Outliers
Extreme values in a distribution

Can use fence values to define outliers
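
The slide does not say how the fences are set. One common convention, assumed here, places them 1.5 interquartile ranges below the lower quartile and above the upper quartile. The Python sketch below uses a hypothetical helper, `find_outliers`, the first ten haemoglobin values from the raw data, and one made-up extreme value (25.0) added purely to show a value falling outside the fences.

```python
import statistics

def find_outliers(values, k=1.5):
    """Return the values lying outside the fences.

    Assumption: fences at k interquartile ranges beyond the quartiles
    (k = 1.5 is a common convention, not taken from the slides).
    """
    lower_quartile, _, upper_quartile = statistics.quantiles(values, n=4)
    iqr = upper_quartile - lower_quartile
    lower_fence = lower_quartile - k * iqr
    upper_fence = upper_quartile + k * iqr
    return [x for x in values if x < lower_fence or x > upper_fence]

# First ten haemoglobin values (g/100ml) from the raw data; an illustrative subset.
sample = [10.2, 13.7, 10.4, 14.9, 11.5, 12.0, 11.0, 13.3, 12.9, 10.6]

print(find_outliers(sample))            # no value lies outside the fences
print(find_outliers(sample + [25.0]))   # 25.0 is a hypothetical extreme value
```

A value flagged this way is a candidate for checking (for example, against a data-entry error), not something to delete automatically.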

Task
Work in pairs
Open Task questions
Open StatsDirect data file
