
Mediation and Multi-group Analyses Lyytinen & Gaskin

In an intervening variable model, variable X is postulated to exert an effect on an outcome variable, Y, through one or more intervening variables called mediators (M). Mediational models advance an X → M → Y causal sequence, and seek to illustrate the mechanisms through which X and Y are related. (Mathieu & Taylor)

Why Mediation?
Seeking a more accurate explanation of the causal effect the antecedent (predictor) has on the DV (criterion, outcome)
Focus on the mechanisms that make the causal chain possible
Missing variables in the causal chain

Intelligence → Performance
Intelligence → Work Effectiveness → Performance

Conditions for mediation

(1) justify the causal order of variables, including temporal precedence;
(2) reasonably exclude the influence of outside factors;
(3) demonstrate acceptable construct validity of their measures;
(4) articulate, a priori, the nature of the intervening effects that they anticipate; and
(5) obtain a pattern of effects that are consistent with their anticipated relationships while also disconfirming alternative hypotheses through statistical tests.

Conditions for mediation

Inferences of mediation are founded first and foremost in terms of theory, research design, and the construct validity of measures employed, and second in terms of statistical evidence of relationships.
Mediation analysis requires:
1) Inferences concerning mediational X → M → Y relationships hinge on the validity of the assertion that the relationships depicted unfold in that sequence (Stone-Romero & Rosopa, 2004). As with SEM, multiple qualitatively different models can be fit equally well to the same covariance matrix. Using the exact same data, one could as easily confirm a Y → M → X mediational chain as one can an X → M → Y sequence (MacCallum, Wegener, Uchino, & Fabrigar, 1993).

Conditions for mediation

2) The purpose of experimental designs is to isolate and test, as best as possible, X → Y relationships from competing sources of influence. In mediational designs, however, this focus is extended to a three-phase X → M → Y causal sequence, requiring random assignment to both X and M and related treatments.
Because researchers may not be able to randomly assign participants to conditions, the causal sequence of X → M → Y is vulnerable to any selection-related threats to internal validity (Cook & Campbell, 1979; Shadish et al., 2002). To the extent that individuals' status on a mediator or criterion variable may alter their likelihood of experiencing a treatment, the implied causal sequence may also be compromised. For example, consider a typical training → self-efficacy → performance mediational chain. If participation in training is voluntary, and more efficacious people are more likely to seek training, then the true sequence of events may well be self-efficacy → training → performance. If higher performing employees develop greater self-efficacy (Bandura, 1986), then the sequence could actually be performance → efficacy → training. If efficacy and performance levels remain fairly stable over time, one

Conditions of mediation
It is a hallmark of good theories that they articulate how and why variables are ordered in a particular way (e.g., Sutton & Staw, 1995; Whetten, 1989). This is perhaps the only basis for advancing a particular causal order in nonexperimental studies with simultaneous measurement of the antecedent, mediator, and criterion variables (i.e., classic cross-sectional designs).
Implicitly, mediational designs advance a time-based model of events whereby X occurs before M, which in turn occurs before Y. It is the temporal relationships of the underlying phenomena that are at issue, not necessarily the timing of measurements.
In other words, in mediation analyses, omitted variables represent a significant threat to the validity of the X → M relationship if they are related both to the antecedent and to the mediator, and have a unique influence on the mediator. Likewise, omitted variables (and related paths) may lead one to conclude falsely that no direct effect X → Y exists, while in fact it holds in the population.

Importance of theory: cause and effect

[Figure: path diagrams placing Training, Self-efficacy, and Performance in alternative causal orders]

Types of Mediation

[Figure: three path diagrams titled Indirect Effect, Partial Mediation, and Full Mediation; the legend distinguishes significant paths from insignificant paths]

More complex mediation structures

Chain Model
X → M1 → M2 → M3 → Y

Parallel Model
[Figure: parallel model path diagram (labels X, M1)]

Hypothesizing Mediation
All types of mediation need to be hypothesized explicitly, with good theoretical reasons and logic, before testing them.

Indirect Effect
You still need to assume and test that X has an indirect effect on Y, though there is no effect in path X → Y.
X has an indirect, positive effect on Y, through M.

Partial or Full
M partially/fully mediates the effect of X on Y.
The effect of X on Y is partially/fully mediated by M.
The effect of X on Y is partially/fully mediated by M1, M2, & M3.

Statistical evidence of relationships.

Each type of mediation needs to be backed by appropriate statistical analysis.
Sometimes the analysis can be based on OLS, but in most cases it needs to be backed by SEM-based path analysis.
There are four types of analyses to detect the presence of mediation relationships:
1. Causal steps approach (Baron & Kenny 1986): tests for the significance of different paths
2. Difference in coefficients: evaluates the changes in betas/coefficients and their significance when new paths are added to the model
3. Product of effect approach: tests for indirect effects a*b; this always needs to be tested or evaluated using bootstrapping
4. Sometimes evaluating differences in R squares

Statistical evidence of relationships

Convergent validity is critical for mediation tests, as it forms the basis for reliability. Poor reliability of the mediator is a particular concern: to the extent that a mediator is measured with less than perfect reliability, the M → Y relationship would likely be underestimated, whereas the X → Y relationship would likely be overestimated when the antecedent and mediator are considered simultaneously (see Baron & Kenny 1986).
Discriminant validity must be gauged in the context of the larger nomological network within which the relationships being considered are believed to reside. Discriminant validity does not imply that measures of different constructs are uncorrelated; the issue is whether measures of different variables are so highly correlated as to raise questions about whether they are assessing different constructs. It is incumbent on researchers to demonstrate that their measures of X, M, and Y evidence acceptable discriminant validity before
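The reliability point above can be illustrated with a small population-level calculation. This is only a sketch under stated assumptions that are not in the slides: X and the true mediator are standardized, the true model is full mediation, and the observed mediator is the true mediator plus random error. The function name `observed_paths` is hypothetical.

```python
def observed_paths(a, b, reliability):
    """Population coefficients from regressing Y on X and an error-laden mediator.

    Assumptions (illustrative only): X and the true M are standardized, the true
    model is full mediation (M = a*X + e1, Y = b*M + e2), and the observed
    mediator is M_obs = M + random error, so reliability = var(M) / var(M_obs).
    """
    var_m_obs = 1.0 / reliability   # observed variance grows as reliability falls
    det = var_m_obs - a ** 2        # determinant of the (X, M_obs) covariance matrix
    direct_x_to_y = a * b * (var_m_obs - 1.0) / det  # spurious direct effect of X
    m_to_y = b * (1.0 - a ** 2) / det                # attenuated M -> Y coefficient
    return direct_x_to_y, m_to_y


perfect = observed_paths(a=0.5, b=0.5, reliability=1.0)
noisy = observed_paths(a=0.5, b=0.5, reliability=0.7)
print("reliability 1.0 (direct, M->Y):", perfect)
print("reliability 0.7 (direct, M->Y):", noisy)
```

With a perfectly reliable mediator the direct effect is zero and the M → Y coefficient equals its true value; with lower reliability the M → Y coefficient shrinks and a nonzero direct effect appears, even though the true model is full mediation.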

Statistical evidence of relationships


Statistical evidence of the relationships

In simple partial mediation, $\beta_{mx}$ is the coefficient for X for predicting M, and $\beta_{ym.x}$ and $\beta_{yx.m}$ are the coefficients predicting Y from both M and X, respectively.
Here $\beta_{yx.m}$ is the direct effect of X, whereas the product $\beta_{mx}\beta_{ym.x}$ quantifies the indirect effect of X on Y through M.
If all variables are observed, then $\beta_{yx} = \beta_{yx.m} + \beta_{mx}\beta_{ym.x}$, or equivalently $\beta_{mx}\beta_{ym.x} = \beta_{yx} - \beta_{yx.m}$.
Indirect effect: the amount by which two cases who differ by one unit on X are expected to differ on Y through X's effect on M, which in turn affects Y
Direct effect: the part of the effect of X on Y that is independent of the pathway through M
Similar logic can be applied to more complex situations
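The decomposition of the total effect into a direct and an indirect part can be checked numerically. The sketch below uses simulated data and plain OLS; the data-generating values, the seed, and the helper names `slope` and `slopes_two_predictors` are illustrative assumptions, not part of the original slides.

```python
import random


def slope(x, y):
    """OLS slope of y on x (model with intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((p - mean_x) * (q - mean_y) for p, q in zip(x, y))
    sxx = sum((p - mean_x) ** 2 for p in x)
    return sxy / sxx


def slopes_two_predictors(x, m, y):
    """OLS slopes of y on x and m (with intercept), from the 2x2 normal equations."""
    n = len(x)
    mean_x, mean_m, mean_y = sum(x) / n, sum(m) / n, sum(y) / n
    sxx = sum((v - mean_x) ** 2 for v in x)
    smm = sum((v - mean_m) ** 2 for v in m)
    sxm = sum((p - mean_x) * (q - mean_m) for p, q in zip(x, m))
    sxy = sum((p - mean_x) * (q - mean_y) for p, q in zip(x, y))
    smy = sum((p - mean_m) * (q - mean_y) for p, q in zip(m, y))
    det = sxx * smm - sxm ** 2
    b_x = (sxy * smm - smy * sxm) / det
    b_m = (smy * sxx - sxy * sxm) / det
    return b_x, b_m


# Simulated partial mediation (all numeric values are arbitrary choices).
rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(500)]
m = [0.5 * xi + rng.gauss(0, 1) for xi in x]
y = [0.2 * xi + 0.4 * mi + rng.gauss(0, 1) for xi, mi in zip(x, m)]

total = slope(x, y)                              # coefficient of X alone
a_path = slope(x, m)                             # X -> M
direct, b_path = slopes_two_predictors(x, m, y)  # X -> Y given M, M -> Y given X
indirect = a_path * b_path

print("total:", total)
print("direct + indirect:", direct + indirect)
```

For observed variables estimated by OLS on the same cases, the two printed values agree up to floating-point error, whatever the simulated coefficients are.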

What would be the paths here?


Statistical analysis
The testing of the existence of the mediational effect depends on the type of indirect effect.
The lack of a direct effect X → Y ($\beta_{yx}$ is either zero or not significant) is not a demonstration of the lack of a mediated effect.
Therefore three different situations prevail (in this order):
1. The presence of an indirect effect ($\beta_{mx}\beta_{ym.x}$ is significant)
2. The presence of full mediation ($\beta_{yx}$ is significant but $\beta_{yx.m}$ is not)
3. The presence of partial mediation ($\beta_{yx}$ is significant and $\beta_{yx.m}$ is nonzero and significant)

Testing for indirect effect


Testing for full mediation


Testing for partial mediation


Observations of statistical analysis

The key is to test for the presence of a significant indirect effect; just demonstrating the significance of paths $\beta_{yx}$, $\beta_{yx.m}$, $\beta_{ym.x}$, and $\beta_{mx}$ is not enough.
One reason is that Type I error testing of the significance of individual paths is not based on inferences about indirect effects as products of effects and their magnitudes.
The indirect effect can be tested using either the Sobel test or bootstrapping.
The Sobel test assumes normality of the product terms and relatively large sample sizes (>200). It lacks power with small sample sizes or if the distribution is not normal.

Bootstrapping (available in most statistical packages, or there is additional code to accomplish it for most software packages)
Samples the distribution of the indirect effect by treating the obtained sample of size n as a miniature representation of the population.
The sample is then resampled randomly with replacement: a resample of size n is built by drawing cases from the original sample, with any case once drawn thrown back so that it can be redrawn as the resample is constructed.
$\beta_{mx}$, $\beta_{ym.x}$, and their product are estimated and recorded for each resample.
The process is repeated k times, where k is large (>1000).
Hence we have k estimates of the indirect effect, and their distribution functions as an empirical approximation of the sampling distribution of the indirect effect when taking a sample of size n from the original population.
Upper and lower bounds of the confidence interval are established by finding the ith lowest and jth largest values in the rank-ordered estimates; the null hypothesis that the indirect effect is zero is rejected, e.g. at the 95% level of confidence, if the interval excludes zero.
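The resampling procedure described above can be sketched as a percentile bootstrap. This is a minimal illustration on simulated data, not the routine used by any particular package; the function names, the seeds, the simulated coefficients, and the choice of k = 2000 are assumptions made for the example.

```python
import random


def indirect_effect(x, m, y):
    """Return a*b: a from regressing M on X, b from regressing Y on M and X."""
    n = len(x)
    mean_x, mean_m, mean_y = sum(x) / n, sum(m) / n, sum(y) / n
    sxx = sum((v - mean_x) ** 2 for v in x)
    smm = sum((v - mean_m) ** 2 for v in m)
    sxm = sum((p - mean_x) * (q - mean_m) for p, q in zip(x, m))
    sxy = sum((p - mean_x) * (q - mean_y) for p, q in zip(x, y))
    smy = sum((p - mean_m) * (q - mean_y) for p, q in zip(m, y))
    a = sxm / sxx
    b = (smy * sxx - sxy * sxm) / (sxx * smm - sxm ** 2)
    return a * b


def bootstrap_ci(x, m, y, k=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the indirect effect."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(k):
        # Draw n case indices with replacement: each case can be redrawn.
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(indirect_effect([x[i] for i in idx],
                                         [m[i] for i in idx],
                                         [y[i] for i in idx]))
    estimates.sort()
    lower = estimates[int(round(k * (1 - level) / 2))]
    upper = estimates[int(round(k * (1 + level) / 2)) - 1]
    return lower, upper


# Simulated full mediation (arbitrary illustrative values).
data_rng = random.Random(42)
x = [data_rng.gauss(0, 1) for _ in range(200)]
m = [0.6 * xi + data_rng.gauss(0, 1) for xi in x]
y = [0.5 * mi + data_rng.gauss(0, 1) for mi in m]

point_estimate = indirect_effect(x, m, y)
lower, upper = bootstrap_ci(x, m, y)
print("indirect effect:", point_estimate)
print("95% percentile interval:", (lower, upper))
```

If the interval excludes zero, the null hypothesis of no indirect effect is rejected at the chosen level. Bias-corrected intervals, which some packages offer, are a refinement not shown here.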

Observations of statistical analysis

In full and partial mediation, the bivariate X → Y relationship (assessed via the correlation rYX or the coefficient $\beta_{yx}$) must be nonzero in the population if the effects of X on Y are mediated by M.
Hence establishing a significant bivariate relationship is conditional on sample size.
For example, assume that N = 100 and the sample correlations are rXM = .30 and rMY = .30; both would be significant at p < .05. However, a sample correlation of rXY = .09 would not! Hence tests for full mediation can be precluded even if this is the true model in the population.
This point becomes even more challenging when complex mediations X → M1 → M2 → M3 → Y are present.
Hence full mediations often go undetected due to underpowered designs; the same holds for interactions or suppression variables. In fact, the four-step Baron & Kenny approach has power of .52 with a sample size of 200 to detect a medium effect!
This can be overcome by bootstrapping.
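The N = 100 example can be reproduced with a few lines of arithmetic. The sketch below uses the Fisher z approximation for the significance of a correlation, which is an assumption of this example (the slides do not say which test was used), and it treats rXY as the product rXM × rMY, which holds for standardized variables under full mediation.

```python
import math


def correlation_p_value(r, n):
    """Two-tailed p-value for H0: rho = 0, via the Fisher z approximation."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))


n = 100
r_xm, r_my = 0.30, 0.30
r_xy = r_xm * r_my  # implied X-Y correlation under full mediation

p_xm = correlation_p_value(r_xm, n)
p_my = correlation_p_value(r_my, n)
p_xy = correlation_p_value(r_xy, n)

print("r_XM = %.2f, p = %.4f" % (r_xm, p_xm))
print("r_MY = %.2f, p = %.4f" % (r_my, p_my))
print("r_XY = %.2f, p = %.4f" % (r_xy, p_xy))
```

Both component correlations clear the .05 threshold while their product does not, which is why a causal-steps test that first requires a significant X → Y relationship can stop before the mediation is ever examined.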

Observations of statistical analysis

Testing for full mediation requires that $\beta_{yx.m}$ is zero. When $\beta_{yx.m}$ does not drop to zero, the evidence supports partial mediation.
This requires researchers to make a priori hypotheses concerning full or partial mediation; otherwise confirmatory tests turn into exploratory data mining.
What counts as a significant reduction from $\beta_{yx}$ to $\beta_{yx.m}$ is not clear (cf. from .15 to .05 vs. from .75 to .65).
Typically the baseline model for mediation is partial mediation, while theoretical clarity and Occam's razor would speak for full mediation.

Testing for Mediation in AMOS

Direct Effects First

Regression Weights

                       Estimate   S.E.   C.R.     P
loylong <--- ctrust    .282       .048   5.812    ***
loylong <--- atrust    .184       .048   3.850    ***

Testing for Mediation in AMOS

Add Mediator

Regression Weights

                       Estimate   S.E.   C.R.     P
value   <--- atrust    .210       .048   4.400    ***
value   <--- ctrust    .602       .048   12.452   ***
loylong <--- ctrust    .089       .056   1.592    .111
loylong <--- atrust    .123       .047   2.638    .008
loylong <--- value     .312       .052   5.935    ***

Testing significance of partially mediated paths: Sobel Test

Use for partially mediated relationships.
Use the Sobel Test online calculator: http://www.danielsoper.com/statcalc/calc31.aspx
Assumes normal distribution and sufficiently large sample
Regression Weights

                       Estimate   S.E.   C.R.     P
value   <--- atrust    .210       .048   4.400    ***
value   <--- ctrust    .602       .048   12.452   ***
loylong <--- ctrust    .089       .056   1.592    .111
loylong <--- atrust    .123       .047   2.638    .008
loylong <--- value     .312       .052   5.935    ***
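As a cross-check on the online calculator, the Sobel statistic can be computed by hand from the regression weights above. The sketch assumes the common first-order Sobel formula, z = ab / sqrt(b² s_a² + a² s_b²); the calculator linked above may use a variant. The function name `sobel_test` is hypothetical.

```python
import math


def sobel_test(a, se_a, b, se_b):
    """First-order Sobel test for the indirect effect a*b.

    a, se_a: estimate and standard error of the X -> M path
    b, se_b: estimate and standard error of the M -> Y path (controlling for X)
    Returns (indirect effect, z statistic, two-tailed p-value).
    """
    ab = a * b
    se_ab = math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)
    z = ab / se_ab
    p = math.erfc(abs(z) / math.sqrt(2))  # two-tailed normal p-value
    return ab, z, p


# Estimates and standard errors taken from the Regression Weights table:
# atrust -> value, ctrust -> value, and value -> loylong.
via_atrust = sobel_test(a=0.210, se_a=0.048, b=0.312, se_b=0.052)
via_ctrust = sobel_test(a=0.602, se_a=0.048, b=0.312, se_b=0.052)

print("atrust -> value -> loylong: ab=%.3f z=%.2f p=%.4f" % via_atrust)
print("ctrust -> value -> loylong: ab=%.3f z=%.2f p=%.4f" % via_ctrust)
```

Both indirect paths through value come out significant under these assumptions, which is consistent with the partial (atrust) and full (ctrust) mediation readings of the table.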

Testing significance of indirect effects: Bootstrapping

At least 1000 bootstrap samples

No Missing Values Allowed!


Testing significance of indirect effects Bootstrapping

p-values

No Mediation: if Indirect is > 0.05
Full Mediation: if Indirect is < 0.05 and Direct is > 0.05 (given the direct effects were significant prior to adding the mediator)
Partial Mediation: if Direct & Indirect are both < 0.05; check Total.

Direct Effects - Two Tailed Significance

         wu      wf      aut     burnm   burnc
burnm    0.003   0.033   0.026   ...     ...
burnc    0.004   0.969   0.435   ...     ...
satc     0.845   0.026   0.260   0.016   0.007
satw     0.004   0.836   0.020   0.011   0.009

Indirect Effects - Two Tailed Significance

         wu      wf      aut     burnm   burnc
burnm    ...     ...     ...     ...     ...
burnc    ...     ...     ...     ...     ...
satc     0.005   0.546   0.016   ...     ...
satw     0.003   0.115   0.016   ...     ...

Total Effects - Two Tailed Significance

         wu      wf      aut     burnm   burnc
burnm    0.003   0.033   0.026   ...     ...
burnc    0.004   0.969   0.435   ...     ...
satc     0.033   0.024   0.026   0.016   0.007
satw     0.003   0.174   0.020   0.011   0.009
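The decision rules for reading the bootstrapped p-values can be written as a small helper and applied to the direct and indirect significance values reported above for satc and satw. This is only a sketch: the function name is hypothetical, and it cannot check the side condition that the direct effect was significant before the mediators were added, so a "full mediation" label is provisional.

```python
def classify_mediation(p_direct, p_indirect, alpha=0.05):
    """Apply the slide's decision rules to two-tailed bootstrap p-values."""
    if p_indirect >= alpha:
        return "no mediation"
    if p_direct >= alpha:
        # Presupposes a significant direct effect before adding the mediator.
        return "full mediation"
    return "partial mediation"  # also check the total effect


# (predictor, outcome): (direct p, indirect p), copied from the tables above.
p_values = {
    ("wu", "satc"): (0.845, 0.005),
    ("wu", "satw"): (0.004, 0.003),
    ("wf", "satc"): (0.026, 0.546),
    ("wf", "satw"): (0.836, 0.115),
    ("aut", "satc"): (0.260, 0.016),
    ("aut", "satw"): (0.020, 0.016),
}

results = {path: classify_mediation(*p) for path, p in p_values.items()}
for (predictor, outcome), label in results.items():
    print(predictor, "->", outcome, ":", label)
```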

[Figure: path diagrams for Partial Mediation and Full Mediation; path coefficients shown: .23***, .37***, .20**, .17*]

WORDING
Overall value partially mediates the effect of trust in agent on long-term loyalty (p < .001).
Overall value fully mediates the effect of trust in company on long-term loyalty (p < .001).


Using AMOS for testing chain models and parallel models


Moderation concept
Based on the observation that the independent-dependent variable relationship is affected by another independent variable
This situation is called a moderator effect, which occurs when a moderator variable (a second independent variable) changes the form of the relationship between another independent variable and the DV
Can be expanded to a situation where the mediated relationship is moderated

Moderation: affecting the effect

Moderating variables must be chosen with strong theoretical support (Hair et al. 2010)
The causality of the moderator cannot be tested directly
Becomes potentially confounded as the moderator becomes correlated with either of the variables in the relationship
Testing is easiest when the moderator has no significant relationship with other constructs
This assumption is important in distinguishing moderators from mediators, which (by definition) are related to both constructs of the mediated

Moderation: Multi-group
Non-metric moderators: categorical variables are hypothesized as moderators (gender, age, turbulence vs. non-turbulence, non-customer vs. customer)
For non-metric variables a multi-group analysis is applied, i.e. the data are split into separate groups based on the variable's values, and the groups are tested for statistical difference (both for the measurement and the structural model)


Multi-group example
Low Protein: Exercise → Weight Loss

High Protein: Exercise → Weight Loss
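The idea behind the multi-group example can be sketched with simulated data: estimate the Exercise → Weight Loss slope separately in each protein group and test whether the two slopes differ. The z-test on the slope difference used here is a simple stand-in chosen for this example, not the multi-group SEM procedure itself, and all data values, seeds, and function names are hypothetical.

```python
import math
import random


def slope_and_se(x, y):
    """OLS slope of y on x (with intercept) and its standard error."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((v - mean_x) ** 2 for v in x)
    sxy = sum((p - mean_x) * (q - mean_y) for p, q in zip(x, y))
    b = sxy / sxx
    intercept = mean_y - b * mean_x
    sse = sum((q - intercept - b * p) ** 2 for p, q in zip(x, y))
    se = math.sqrt(sse / (n - 2) / sxx)
    return b, se


def compare_slopes(b1, se1, b2, se2):
    """z-test for the difference between two independent slopes."""
    z = (b1 - b2) / math.sqrt(se1 ** 2 + se2 ** 2)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-tailed normal p-value
    return z, p


def simulate_group(true_slope, n, seed):
    """Hypothetical group data: weight loss depends on exercise plus noise."""
    rng = random.Random(seed)
    exercise = [rng.gauss(0, 1) for _ in range(n)]
    weight_loss = [true_slope * e + rng.gauss(0, 1) for e in exercise]
    return exercise, weight_loss


# Split by the categorical moderator (protein intake); slopes are made up.
low_x, low_y = simulate_group(true_slope=0.2, n=200, seed=1)
high_x, high_y = simulate_group(true_slope=0.8, n=200, seed=2)

b_low, se_low = slope_and_se(low_x, low_y)
b_high, se_high = slope_and_se(high_x, high_y)
z, p = compare_slopes(b_high, se_high, b_low, se_low)

print("low protein slope:", b_low)
print("high protein slope:", b_high)
print("difference z = %.2f, p = %.4f" % (z, p))
```

A significant difference between the group slopes is what it means for the categorical variable to moderate the relationship.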

Moderator vs. Mediator

Mediator: the means by which IV affects DV

Moderator: a variable that influences the magnitude of the effect an IV has on a DV



Mediation vs. Moderation Example

Notice that the mediator and the moderator can be th
Can a mediator also be used as a moderator?
Yes - see Baron and Kenny 1986 for a complex exam

Some Theory-based Criteria

(i.e., arguments for mediation and moderation are based on theory first, rather than statistical correlations)

Logical effect of IV
Logical cause of DV

Not logically correlated to IV or DV (if categorical)
Holistic/multiplicative effect (interaction)
Varying effect for different categorical values (multi-



Driving home the point: Moderator or Mediator?

Either, Neither, One or the Other?

Candidate variables: Caloric intake, Exercise partner, Exercise IQ, Activity level, Protein intake, Attitude reinforcement, Gender, Age, Heredity

Exercise → M → Weight Loss

Exercise → Weight Loss

Koufteros & Marcoulides 2006