
WEEK 3 – UNIT 11

BIVARIATE STATS
MCO552
Data in Business Journalism
Prof. Steve Doig
Two kinds of relationships
• Deterministic: You can
predict one variable exactly
given another
• Statistical: You can describe
a relationship between
variables, but it isn’t precise
because of natural variability
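A minimal sketch of the two kinds of relationship. The temperature conversion is deterministic. The height and weight model, including its coefficients and noise level, is a hypothetical illustration and not real data.

```python
import random

def celsius_to_fahrenheit(c):
    """Deterministic: the output is fixed exactly by the input."""
    return c * 9 / 5 + 32

def simulated_weight_kg(height_cm, rng):
    """Statistical: a made-up trend plus natural variability (random noise)."""
    return 0.9 * height_cm - 90 + rng.gauss(0, 8)

rng = random.Random(42)  # seeded so repeated runs give the same draws

# The same input always gives the same answer.
print(celsius_to_fahrenheit(100))

# The same height gives a different simulated weight each time.
print([round(simulated_weight_kg(175, rng), 1) for _ in range(3)])
```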
Scatterplot
Strength of Relationship?
Correlation (also called the
correlation coefficient or
Pearson’s r) is the measure of
strength of the linear
relationship between two
variables.

Think of strength as how closely
the data points come to falling
on a line drawn through the
data.
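One way to compute Pearson's r by hand is sketched below, using the standard formula (sum of cross-products of deviations, divided by the square root of the product of the sums of squared deviations). The function name `pearson_r` and the sample data are assumptions made for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r computed from deviations around each mean."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]
on_line = [2 * x + 1 for x in xs]        # every point falls exactly on a line
near_line = [3.2, 4.7, 7.4, 8.8, 11.1]   # points scatter slightly around a line

print(pearson_r(xs, on_line))    # perfect linear relationship
print(pearson_r(xs, near_line))  # strong, but not perfect
```

The closer the points come to a single straight line, the closer r gets to +1 or -1.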
Scatterplot with trendline
Features of Correlation (r)
• r can range from +1 to -1
• Positive correlation: As
one variable increases, the
other increases
• Negative correlation: As
one variable increases, the
other decreases
• Zero correlation means the
best line through the data is
horizontal
• Correlation isn’t affected by
the units of measurement
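These features can be checked with a short sketch. The data are invented for illustration, and `pearson_r` is a hypothetical helper written for this example. The last case uses a curved relationship to show that r = 0 only rules out a linear trend.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r computed from deviations around each mean."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical measurements, for illustration only.
heights_in = [60, 62, 65, 68, 70, 72]
weights_lb = [115, 120, 140, 150, 165, 180]

# Units don't matter: convert inches to cm and pounds to kg.
heights_cm = [h * 2.54 for h in heights_in]
weights_kg = [w * 0.4536 for w in weights_lb]
r_imperial = pearson_r(heights_in, weights_lb)
r_metric = pearson_r(heights_cm, weights_kg)

# Negative correlation: one variable falls as the other rises.
r_negative = pearson_r(heights_in, [-w for w in weights_lb])

# Zero correlation: y = x squared rises on both sides of zero,
# so the best straight line through these points is horizontal.
xs = [-2, -1, 0, 1, 2]
r_zero = pearson_r(xs, [x ** 2 for x in xs])

print(r_imperial, r_metric)  # same value up to rounding
print(r_negative)            # same size, opposite sign
print(r_zero)                # zero
```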
Variables
• When considering relationships
between measurement variables,
there are two kinds:
• Explanatory (or independent)
variable: The variable that attempts to
explain or is purported to cause (at
least partially) differences in the…
• Response (or dependent or outcome)
variable
• Often, chronology is a guide to
distinguishing them.
Positive Correlations
[Scatterplots: r = +.1, r = +.4, r = +.8, r = +1]
Negative Correlations
[Scatterplots: r = -.1, r = -.4, r = -.8, r = -1]
Zero correlation
[Scatterplots: r = 0, r = 0]
Number of Points Doesn’t Matter
[Scatterplots: r = .8, r = .8]
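A small sketch of this idea, under the assumption that the larger dataset follows the same pattern as the smaller one. Here the larger set is simply the smaller one repeated ten times. The values and the `pearson_r` helper are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r computed from deviations around each mean."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Five made-up points with a fairly strong positive relationship.
small_x = [1, 2, 3, 4, 5]
small_y = [2, 1, 4, 3, 6]

# Fifty points with the same pattern (the five points repeated ten times).
big_x = small_x * 10
big_y = small_y * 10

r_small = pearson_r(small_x, small_y)
r_big = pearson_r(big_x, big_y)
print(round(r_small, 2), round(r_big, 2))  # the two values match
```

The value of r reflects how tightly the points follow a line, not how many points there are.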
Important!
Correlation
does not imply
causation.
Spurious correlations
Some reasons for correlation
• One variable is causing
change in the other
• Explanatory variable is a
contributing, but not sole,
cause of change
• Confounding (lurking)
variables may exist
• Both variables affected by a
common cause
• Both are changing over time
• Pure coincidence
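The common-cause case can be sketched with invented numbers. In this hypothetical example, daily temperature drives both ice cream sales and pool visits. Neither series is computed from the other, yet they correlate strongly. All names and values are assumptions made for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r computed from deviations around each mean."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Common cause (made-up daily temperatures in degrees Celsius).
temps = [10, 15, 20, 25, 30, 35]

# Each outcome depends only on temperature plus its own small fixed wobble.
ice_cream_sales = [10 * t + n for t, n in zip(temps, [5, -8, 3, -4, 7, -2])]
pool_visits = [2 * t + n for t, n in zip(temps, [-3, 4, -1, 2, -4, 3])]

# Strong correlation, even though neither variable causes the other here.
r = pearson_r(ice_cream_sales, pool_visits)
print(round(r, 2))
```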
END OF UNIT 11 VIDEO
