
703 Application of Statistics in Marine Science

Lecturer: Dr Femi OYEDIRAN (FIPAN)


Descriptive and inferential statistics
Statistical terms
Scales
Discrete and Continuous data
Accuracy, precision, rounding and errors
Charts
Distributions
Central value, dispersion and symmetry

Statistics...
A set of mathematical procedures for describing, synthesizing, analyzing, and interpreting quantitative data. The selection of an appropriate statistical technique is determined by the research design, the hypothesis, and the data collected.

What are Statistics?

Procedures for organising, summarizing, and interpreting information
Standardized techniques used by scientists
A vocabulary and symbols for communicating about data
A tool box

How do you know which tool to use?


(1) What do you want to know?
(2) What type of data do you have?
Two main branches:
Descriptive statistics
Inferential statistics
Descriptive and Inferential statistics
A. Descriptive Statistics: tools for summarising, organising, and simplifying data
Tables and graphs
Measures of central tendency
Measures of variability
Examples: average rainfall in Manchester last year; number of car thefts last year; your test results; percentage of males in our class.
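As a quick illustration of these descriptive measures, the short sketch below uses NumPy (a tool choice of this note, not one prescribed by the lecture) with made-up rainfall figures:

```python
import numpy as np

# Hypothetical monthly rainfall values (mm); illustrative only.
rainfall = np.array([78, 65, 90, 55, 102, 88, 70, 95, 60, 84, 110, 73])

print("Mean:              ", np.mean(rainfall))          # central tendency
print("Median:            ", np.median(rainfall))        # central tendency
print("Standard deviation:", np.std(rainfall, ddof=1))   # variability (sample SD)
print("Range:             ", rainfall.max() - rainfall.min())
```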
B. Inferential Statistics: data from a sample are used to draw inferences about a population
Generalising beyond the actual observations
Generalising from a sample to a population

Inferential statistics... mathematical tools that permit the researcher to generalize to a population of individuals based upon information obtained from a limited number of research participants.
1. Sampling error... the differences in samples due to random fluctuations within the population
Sampling errors vary in size but are normally distributed around the population mean (M) and take the shape of a bell curve.
2. Standard error... the standard deviation of the sample means (SEx̄)
Tells the researcher by how much the sample means would be expected to differ if other samples from the same population were used.
But the researcher does not have to select a large number of samples from a population to estimate the standard error; a mathematical formula can be used to estimate it:
SEx̄ = SD / √(N − 1)
A smaller standard error indicates less sampling error.
The major factor affecting the size of the standard error of the mean is sample size, but the size of the population standard deviation also affects the standard error of the mean.
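A minimal sketch of this formula using NumPy; the fish-length values and variable names are made up for illustration. (Many texts write the equivalent form SD/√N when SD is the sample standard deviation.)

```python
import numpy as np

# Hypothetical sample of fish lengths (cm); illustrative only.
lengths = np.array([12.1, 13.4, 11.8, 12.9, 13.0, 12.5, 11.9, 13.2])

N = len(lengths)
SD = np.std(lengths)        # SD computed with N in the denominator, as in the formula above
SE = SD / np.sqrt(N - 1)    # SExbar = SD / sqrt(N - 1)

print(f"N = {N}, SD = {SD:.3f}, SE of the mean = {SE:.3f}")

# Equivalent check: the sample SD (ddof=1) divided by sqrt(N) gives the same value.
print(np.std(lengths, ddof=1) / np.sqrt(N))
```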

Statistical terms
Population: the complete set of individuals, objects or measurements
Sample: a sub-set of a population
Variable: a characteristic which may take on different values
Data: numbers or measurements collected
A parameter is a characteristic of a population, e.g., the average height of all Britons.
A statistic is a characteristic of a sample, e.g., the average height of a sample of Britons.
Discrete and Continuous data
Data consisting of numerical (quantitative) variables can be further divided into two groups: discrete and continuous.
1. Discrete: the set of all possible values, when pictured on the number line, consists only of isolated points.
2. Continuous: the set of all possible values, when pictured on the number line, consists of intervals.
The most common type of discrete variable we will encounter is a counting variable.
Accuracy and precision

Accuracy is the degree of conformity of a measured or calculated quantity to its actual (true) value.
Accuracy is closely related to precision, also called reproducibility or repeatability: the degree to which further measurements or calculations will show the same or similar results.
e.g., using an instrument to measure a property of a rock sample.

t-test...

Used to determine whether two means are significantly different at a selected probability level.
Adjusts for the fact that the distribution of scores for small samples becomes increasingly different from the normal distribution as sample sizes become increasingly smaller.
The strategy of the t-test is to compare the actual mean difference observed to the difference expected by chance.
It forms a ratio: the numerator is the difference between the sample means, and the denominator is the chance difference that would be expected if the null hypothesis were true.
After the numerator is divided by the denominator, the resulting t value is compared to the appropriate t table value, depending on the probability level and the degrees of freedom.
If the t value is equal to or greater than the table value, the null hypothesis is rejected because the difference is greater than would be expected due to chance.
There are two types of t-tests: the t-test for independent samples (randomly formed) and the t-test for nonindependent samples (nonrandomly formed, e.g., matching, performance on a pre-/post-test, different treatments).
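The sketch below illustrates the independent-samples case with SciPy; the site names and shell-length values are hypothetical, and scipy.stats is simply one common tool for this test, not one prescribed by the lecture.

```python
import numpy as np
from scipy import stats

# Hypothetical shell lengths (mm) from two independently sampled sites; illustrative only.
site_a = np.array([21.3, 19.8, 22.5, 20.1, 23.0, 21.7, 20.9, 22.2])
site_b = np.array([18.9, 19.5, 20.2, 18.4, 19.9, 20.5, 19.1, 18.7])

# Independent-samples t-test (groups formed separately and randomly).
t_stat, p_value = stats.ttest_ind(site_a, site_b)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")

# If p is at or below the selected probability level (e.g. 0.05), the null hypothesis is rejected.
alpha = 0.05
print("Reject H0" if p_value <= alpha else "Fail to reject H0")

# For nonindependent (paired/matched) samples, e.g. pre-/post-test scores on the same
# individuals, the paired version would be used instead:
# stats.ttest_rel(pre_scores, post_scores)
```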
ANOVA...

Used to determine whether two or more means are significantly different at a selected probability level.
Avoids the need to compute duplicate t-tests to compare groups.
The strategy of ANOVA is that total variation, or variance, can be divided into two sources: (a) treatment variance (between-groups variance, caused by the treatment groups) and (b) error variance (within-groups variance).
It forms a ratio, the F ratio, with the treatment variance as the numerator (between-group variance) and the error variance as the denominator (within-group variance).
The assumption is that randomly formed groups of participants are chosen and are essentially the same at the beginning of a study on a measure of the dependent variable.
At the study's end, the question is whether the variance between the groups differs from the error variance by more than what would be expected by chance.
If the treatment variance is sufficiently larger than the error variance, a significant F ratio results; that is, the null hypothesis is rejected and it is concluded that the treatment had a significant effect on the dependent variable.
If the treatment variance is not sufficiently larger than the error variance, an insignificant F ratio results; that is, the null hypothesis is accepted and it is concluded that the treatment had no significant effect on the dependent variable.
When the F ratio is significant and more than two means are involved, researchers use multiple comparison procedures (e.g., the Scheffé test, Tukey's HSD test, Duncan's multiple range test).
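A minimal one-way ANOVA sketch with SciPy, assuming three made-up treatment groups (group names and growth values are illustrative only):

```python
import numpy as np
from scipy import stats

# Hypothetical growth measurements under three treatments; illustrative only.
treatment_1 = np.array([4.1, 4.5, 3.9, 4.8, 4.3])
treatment_2 = np.array([5.2, 5.6, 4.9, 5.8, 5.4])
treatment_3 = np.array([4.0, 4.2, 3.8, 4.4, 4.1])

# One-way ANOVA: F = between-group (treatment) variance / within-group (error) variance.
f_stat, p_value = stats.f_oneway(treatment_1, treatment_2, treatment_3)
print(f"F = {f_stat:.3f}, p = {p_value:.4f}")

# A significant F only says that at least one mean differs; with more than two groups,
# a multiple-comparison procedure (e.g. Tukey's HSD) identifies which pairs differ:
# from statsmodels.stats.multicomp import pairwise_tukeyhsd
```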

Factorial ANOVA (FANOVA)...

Used when a research study uses a factorial design to investigate two or more independent variables and the interactions between them.
Provides a separate F ratio for each independent variable and each interaction.
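One possible sketch of a factorial ANOVA using the statsmodels formula interface; the 2x2 design, the factor names (temperature, salinity), and the growth values are invented for illustration.

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Hypothetical 2x2 factorial data: two temperatures crossed with two salinities; illustrative only.
df = pd.DataFrame({
    "temperature": ["low"] * 6 + ["high"] * 6,
    "salinity":    (["low"] * 3 + ["high"] * 3) * 2,
    "growth":      [3.1, 2.9, 3.3, 3.8, 4.0, 3.7, 4.5, 4.7, 4.4, 5.6, 5.9, 5.8],
})

# Fit a model with both main effects and their interaction, then request the ANOVA table,
# which reports a separate F ratio for each independent variable and for the interaction.
model = ols("growth ~ C(temperature) * C(salinity)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))
```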
Multiple Regression...

A prediction equation that includes more than one predictor.
The predictors are variables known to individually predict (correlate with) the criterion, combined to make a more accurate prediction.
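A small sketch of a two-predictor regression with scikit-learn; the predictors (temperature, salinity), the criterion (catch), and all values are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical predictors (temperature in C, salinity in PSU) and criterion (catch); illustrative only.
X = np.array([[18.0, 33.1], [19.5, 34.0], [21.0, 33.5],
              [22.5, 35.2], [24.0, 34.8], [25.5, 36.0]])
y = np.array([120, 135, 150, 170, 182, 200])

# Fit the prediction equation: y_hat = b0 + b1*temperature + b2*salinity
model = LinearRegression().fit(X, y)
print("Intercept:   ", model.intercept_)
print("Coefficients:", model.coef_)

# Predict the criterion for a new observation using both predictors at once.
print("Prediction:  ", model.predict([[23.0, 35.0]]))
```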
Chi Square (χ²)...

A nonparametric test of significance appropriate for nominal or ordinal data that can be converted to frequencies.
Compares the proportions actually observed (O) to the proportions expected (E) to see if they are significantly different.
The chi square value increases as the difference between observed and expected frequencies increases.
ANCOVA...

ANCOVA can also be used to increase the power of a statistical test by reducing within-group (error) variance, that is, to make a correct decision to reject the null hypothesis.

One- and two-tailed tests of significance...

Tests of significance that indicate the direction in which a difference may occur.
The word "tail" indicates the area of rejection beneath the normal curve.
A = B: no difference between the means is hypothesized; the difference can be positive or negative, so the direction can be in either tail of the normal curve. This is called a two-tailed test; the significance level (α) is divided between the two tails of the normal curve.
A > B or A < B: there is a difference between the means, and the direction is either positive or negative. This is called a one-tailed test; the significance level (α) is found in one tail of the normal curve.
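The difference shows up in the critical values: splitting α across two tails pushes the cut-off further out than placing all of α in one tail. A small sketch with SciPy (the α and df values are arbitrary):

```python
from scipy import stats

alpha = 0.05
df = 20  # degrees of freedom; illustrative value

# Two-tailed test: alpha is split between both tails of the distribution.
two_tailed_crit = stats.t.ppf(1 - alpha / 2, df)

# One-tailed test: all of alpha sits in a single tail, so the cut-off is smaller.
one_tailed_crit = stats.t.ppf(1 - alpha, df)

print(f"Two-tailed critical t (alpha={alpha}, df={df}): {two_tailed_crit:.3f}")
print(f"One-tailed critical t (alpha={alpha}, df={df}): {one_tailed_crit:.3f}")
```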

Degrees of freedom (df)...

A statistical concept indicating that one degree of freedom is lost each time a population parameter is estimated on the basis of a sample of data from the population.
(The null hypothesis indicates that there is no true difference or relationship between the parameters in the populations.)
Degrees of freedom reflect the ability of the sample means to vary, which depends upon the number of participants and the number of groups.
For example: as the number of participants (and hence the df) increases, the value needed to reject the null hypothesis becomes smaller.
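A small sketch showing how the critical t value needed for rejection shrinks as the degrees of freedom grow (α and the df values are arbitrary):

```python
from scipy import stats

alpha = 0.05

# Critical two-tailed t values shrink as the degrees of freedom (participants) increase,
# so smaller observed differences can reach significance in larger samples.
for df in (5, 10, 30, 100):
    print(f"df = {df:>3}: critical t = {stats.t.ppf(1 - alpha / 2, df):.3f}")
```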
