Introduction to Causal Mediation Analysis Using R - American

Introduction to Causal Mediation Analysis Using R
Teppei Yamamoto
Massachusetts Institute of Technology
ASA Webinar
March 9, 2017

Identification of Causal Mechanisms

- Causal inference is a central goal of scientific research
- Scientists care about causal mechanisms, not just about causal effects
- Randomized experiments often only determine whether the treatment causes changes in the outcome, not how and why the treatment affects the outcome
- Common criticism of experiments and statistics: black box view of causality
- Question: How can we learn about causal mechanisms from experimental and observational studies?

Teppei Yamamoto (MIT)

Causal Mediation Analysis

March 9, 2017

1 / 47

Webinar Overview

Present a general framework for statistical analysis and research design strategies to understand causal mechanisms:

1. Show that the sequential ignorability assumption is required to identify mechanisms even in experiments
2. Offer a flexible estimation strategy under this assumption
3. Introduce a sensitivity analysis to probe this assumption
4. Illustrate how to use the statistical software package mediation
5. Consider research designs that relax sequential ignorability (time permitting)

Causal Mediation Analysis

Graphical representation:

[Figure: causal diagram with nodes Treatment (T), Mediator (M), and Outcome (Y)]

- Goal is to decompose total effect into direct and indirect effects
- Alternative approach: decompose the treatment into different components
- Causal mediation analysis as quantitative process tracing

Example: Psychological Study of Media Effects

- Large literature on how media influences public opinion
- A media framing experiment of Brader et al.:
  1. (White) subjects read a mock NYT article about immigration:
     - Treatment: Hispanic immigrant in the story
     - Control: European immigrant in the story
  2. Measure attitudinal and behavioral outcome variables:
     - Opinions about increasing or decreasing immigration
     - Contact legislator about the issue
     - Send anti-immigration message to legislator
- Why is group-based media framing effective? Role of emotion
- Hypothesis: Hispanic immigrant increases anxiety, leading to greater opposition to immigration
- The primary goal is to examine how, not whether, media framing shapes public opinion

Causal Mediation Analysis in Brader et al.

[Figure: causal diagram with nodes Media Cue (T), Anxiety (M), and Immigration Attitudes (Y)]

- Does the media framing shape public opinion by making people anxious?
- An alternative causal mechanism: change in beliefs
- Can we identify mediation effects from randomized experiments?

The Standard Estimation Method

Linear models for mediator and outcome:

$$Y_i = \alpha_1 + \beta_1 T_i + \xi_1^\top X_i + \epsilon_{i1}$$
$$M_i = \alpha_2 + \beta_2 T_i + \xi_2^\top X_i + \epsilon_{i2}$$
$$Y_i = \alpha_3 + \beta_3 T_i + \gamma M_i + \xi_3^\top X_i + \epsilon_{i3}$$

where $X_i$ is a set of pre-treatment or control variables

1. Total effect (ATE) is $\beta_1$
2. Direct effect is $\beta_3$
3. Indirect or mediation effect is $\beta_2 \gamma$
4. Effect decomposition: $\beta_1 = \beta_3 + \beta_2 \gamma$
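
The decomposition in item 4 can be checked numerically. The following is a minimal sketch on simulated data; the sample size, coefficient values, and variable names are illustrative assumptions, not taken from the webinar. With least squares and the same covariates in all three regressions, the identity holds exactly in the sample.

```r
# Simulated data: all numbers and names here are illustrative assumptions.
set.seed(42)
n <- 1000
x <- rnorm(n)                                        # pre-treatment covariate X
treat <- rbinom(n, 1, 0.5)                           # randomized binary treatment T
med <- 0.5 * treat + 0.3 * x + rnorm(n)              # mediator M
y <- 0.4 * treat + 0.8 * med + 0.2 * x + rnorm(n)    # outcome Y

fit_total <- lm(y ~ treat + x)          # Y on T and X: gives beta_1
fit_med   <- lm(med ~ treat + x)        # M on T and X: gives beta_2
fit_out   <- lm(y ~ treat + med + x)    # Y on T, M and X: gives beta_3 and gamma

beta1_hat <- unname(coef(fit_total)["treat"])
beta2_hat <- unname(coef(fit_med)["treat"])
beta3_hat <- unname(coef(fit_out)["treat"])
gamma_hat <- unname(coef(fit_out)["med"])

# Total effect versus direct effect plus product of coefficients
decomposition <- c(total = beta1_hat,
                   direct = beta3_hat,
                   indirect = beta2_hat * gamma_hat)
print(round(decomposition, 3))
```

The identity is purely algebraic for linear least squares fits; it says nothing yet about whether the pieces have a causal interpretation, which is the subject of the questions that follow.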

Some motivating questions:

1. What should we do when we have interaction or nonlinear terms?
2. What about other models such as logit?
3. In general, under what conditions can we interpret $\beta_1$ and $\beta_2 \gamma$ as causal effects?
4. What do we really mean by causal mediation effect anyway?

Potential Outcomes Framework of Causal Inference

Observed data:
- Binary treatment: $T_i \in \{0, 1\}$
- Mediator: $M_i \in \mathcal{M}$
- Outcome: $Y_i \in \mathcal{Y}$
- Observed pre-treatment covariates: $X_i \in \mathcal{X}$

Potential outcomes model (Neyman, Rubin):
- Potential mediators: $M_i(t)$ where $M_i = M_i(T_i)$
- Potential outcomes: $Y_i(t, m)$ where $Y_i = Y_i(T_i, M_i(T_i))$

Total causal effect:

$$\tau_i \equiv Y_i(1, M_i(1)) - Y_i(0, M_i(0))$$

Fundamental problem of causal inference: only one potential outcome can be observed for each $i$

Back to the Example

- $M_i(1)$: Level of anxiety individual $i$ would report if he read the story with Hispanic immigrant
- $Y_i(1, M_i(1))$: Immigration attitude individual $i$ would report if he read the story with Hispanic immigrant and reported the anxiety level $M_i(1)$
- $M_i(0)$ and $Y_i(0, M_i(0))$ are the converse

Causal Mediation Effects

Causal mediation (indirect) effects:

$$\delta_i(t) \equiv Y_i(t, M_i(1)) - Y_i(t, M_i(0))$$

- Causal effect of the change in $M_i$ on $Y_i$ that would be induced by treatment
- Change the mediator from $M_i(0)$ to $M_i(1)$ while holding the treatment constant at $t$
- Represents the mechanism through $M_i$
- Zero treatment effect on mediator $\Longrightarrow$ Zero mediation effect
- Example: Difference in immigration attitudes that is due to the change in anxiety induced by the treatment news story

Total Effect = Indirect Effect + Direct Effect

Direct effects:

$$\zeta_i(t) \equiv Y_i(1, M_i(t)) - Y_i(0, M_i(t))$$

- Causal effect of $T_i$ on $Y_i$, holding mediator constant at its potential value that would realize when $T_i = t$
- Change the treatment from 0 to 1 while holding the mediator constant at $M_i(t)$
- Represents all mechanisms other than through $M_i$
- Total effect = mediation (indirect) effect + direct effect
- cf. Controlled direct effects: $\xi_i(t, m, m') \equiv Y_i(t, m) - Y_i(t, m')$
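
The identity "total effect = indirect effect + direct effect" follows from adding and subtracting one counterfactual outcome. A short worked derivation, using only the definitions of $\tau_i$, $\delta_i(t)$ and $\zeta_i(t)$ given in these slides:

```latex
\begin{align*}
\tau_i &= Y_i(1, M_i(1)) - Y_i(0, M_i(0)) \\
       &= \underbrace{Y_i(1, M_i(1)) - Y_i(1, M_i(0))}_{\delta_i(1)}
        + \underbrace{Y_i(1, M_i(0)) - Y_i(0, M_i(0))}_{\zeta_i(0)} \\
       &= \underbrace{Y_i(0, M_i(1)) - Y_i(0, M_i(0))}_{\delta_i(0)}
        + \underbrace{Y_i(1, M_i(1)) - Y_i(0, M_i(1))}_{\zeta_i(1)}
\end{align*}
```

In both versions the indirect effect at $t$ is paired with the direct effect at $1 - t$, so $\tau_i = \delta_i(t) + \zeta_i(1 - t)$ for $t = 0, 1$.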


What Does the Observed Data Tell Us?

Recall the Brader et al. experimental design:
1. randomize $T_i$
2. measure $M_i$ and then $Y_i$

- Among observations with $T_i = t$, we observe $Y_i(t, M_i(t))$ but not $Y_i(t, M_i(1 - t))$
- But we want to estimate $\delta_i(t) \equiv Y_i(t, M_i(1)) - Y_i(t, M_i(0))$
- For $t = 1$, we observe $Y_i(1, M_i(1))$ but not $Y_i(1, M_i(0))$
- Similarly, for $t = 0$, we observe $Y_i(0, M_i(0))$ but not $Y_i(0, M_i(1))$
- We have the identification problem $\Longrightarrow$ Need assumptions or better research designs

Counterfactuals in the Example

- Suppose that a subject viewed the news story with Hispanic immigrant ($T_i = 1$)
- For this person, $Y_i(1, M_i(1))$ is the observed immigration opinion
- $Y_i(1, M_i(0))$ is his immigration opinion in the counterfactual world where he still views the story with Hispanic immigrant but his anxiety is at the same level as if he viewed the control news story
- We can't observe this because $M_i(0)$ is not realized when $T_i = 1$

Sequential Ignorability Assumption

Identification assumption: Sequential Ignorability (SI)

$$\{Y_i(t', m), M_i(t)\} \perp\!\!\!\perp T_i \mid X_i = x, \qquad (1)$$
$$Y_i(t', m) \perp\!\!\!\perp M_i(t) \mid T_i = t, X_i = x \qquad (2)$$

In words,
1. $T_i$ is (as-if) randomized conditional on $X_i = x$
2. $M_i(t)$ is (as-if) randomized conditional on $X_i = x$ and $T_i = t$

Important limitations:
1. In a standard experiment, (1) holds but (2) may not
2. $X_i$ needs to include all confounders
3. $X_i$ must be pre-treatment confounders $\Longrightarrow$ post-treatment confounder is not allowed
4. Randomizing $M_i$ via manipulation is not the same as assuming $M_i(t)$ is as-if randomized

Sequential Ignorability in the Standard Experiment

Back to Brader et al.:
- Treatment is randomized $\Longrightarrow$ (1) is satisfied
- But (2) may not hold:
  1. Pre-treatment confounder or $X_i$: state of residence. Those who live in AZ tend to have higher levels of perceived harm and be opposed to immigration
  2. Post-treatment confounder: alternative mechanism. Beliefs about the likely negative impact of immigration make people anxious
- Pre-treatment confounders $\Longrightarrow$ measure and adjust for them
- Post-treatment confounders $\Longrightarrow$ adjusting is not sufficient

Nonparametric Identification

Under SI, both ACME and average direct effects are nonparametrically identified (can be consistently estimated without modeling assumption)

ACME $\bar{\delta}(t)$:

$$\int\!\!\int E(Y_i \mid M_i, T_i = t, X_i) \left\{ dP(M_i \mid T_i = 1, X_i) - dP(M_i \mid T_i = 0, X_i) \right\} dP(X_i)$$

Average direct effects $\bar{\zeta}(t)$:

$$\int\!\!\int \left\{ E(Y_i \mid M_i, T_i = 1, X_i) - E(Y_i \mid M_i, T_i = 0, X_i) \right\} dP(M_i \mid T_i = t, X_i) \, dP(X_i)$$

Implies the general mediation formula under any statistical model

Traditional Estimation Methods: LSEM

Linear structural equation model (LSEM):

$$M_i = \alpha_2 + \beta_2 T_i + \xi_2^\top X_i + \epsilon_{i2},$$
$$Y_i = \alpha_3 + \beta_3 T_i + \gamma M_i + \xi_3^\top X_i + \epsilon_{i3}.$$

- Fit two least squares regressions separately
- Use product of coefficients ($\hat{\beta}_2 \hat{\gamma}$) to estimate ACME
- Use asymptotic variance to test significance (Sobel test)
- Under SI and the no-interaction assumption ($\bar{\delta}(1) = \bar{\delta}(0)$), $\hat{\beta}_2 \hat{\gamma}$ consistently estimates ACME
- Can be extended to LSEM with interaction terms
- Problem: Only valid for the simplest LSEM

Popular Baron-Kenny Procedure

The procedure:
1. Regress Y on T and show a significant relationship
2. Regress M on T and show a significant relationship
3. Regress Y on M and T, and show a significant relationship between Y and M

The problems:
1. First step can lead to false negatives, especially if indirect and direct effects are in opposite directions
2. The procedure only anticipates the simplest linear models
3. Output does not generally equal an interpretable effect size

A General Estimation Algorithm

1. Model outcome and mediator
   - Outcome model: $p(Y_i \mid T_i, M_i, X_i)$
   - Mediator model: $p(M_i \mid T_i, X_i)$
   - These models can be of any form (linear or nonlinear, semi- or nonparametric, with or without interactions)
2. Predict mediator for both treatment values ($M_i(1)$, $M_i(0)$)
3. Predict outcome by first setting $T_i = 1$ and $M_i = M_i(0)$, and then $T_i = 1$ and $M_i = M_i(1)$
4. Compute the average difference between two outcomes to obtain a consistent estimate of ACME
5. Monte-Carlo or bootstrap to estimate uncertainty
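
The five steps can be sketched in base R. This is a simplified illustration under stated assumptions, not the implementation used by the mediation package: the data are simulated, both models are linear, predicted means stand in for simulated draws of the mediator, and the function name `estimate_acme` is hypothetical.

```r
# Simulated data: all numbers and names here are illustrative assumptions.
set.seed(2017)
n <- 500
dat <- data.frame(x = rnorm(n), treat = rbinom(n, 1, 0.5))
dat$med <- 0.5 * dat$treat + 0.3 * dat$x + rnorm(n)
dat$y <- 0.4 * dat$treat + 0.8 * dat$med + 0.2 * dat$x + rnorm(n)

# Hypothetical helper: point estimate of the ACME for the treated, delta(1).
estimate_acme <- function(d) {
  # Step 1: model the mediator and the outcome
  fit_m <- lm(med ~ treat + x, data = d)
  fit_y <- lm(y ~ treat + med + x, data = d)
  # Step 2: predict the mediator under both treatment values
  d_t1 <- d; d_t1$treat <- 1
  d_t0 <- d; d_t0$treat <- 0
  m1 <- predict(fit_m, newdata = d_t1)
  m0 <- predict(fit_m, newdata = d_t0)
  # Step 3: predict the outcome with T = 1, first at M(0), then at M(1)
  d_m0 <- d_t1; d_m0$med <- m0
  d_m1 <- d_t1; d_m1$med <- m1
  y_1_m0 <- predict(fit_y, newdata = d_m0)
  y_1_m1 <- predict(fit_y, newdata = d_m1)
  # Step 4: average the difference between the two predicted outcomes
  mean(y_1_m1 - y_1_m0)
}

acme_hat <- estimate_acme(dat)

# Step 5: nonparametric bootstrap, refitting both models on each resample
boot_draws <- replicate(200, estimate_acme(dat[sample(n, replace = TRUE), ]))
ci <- quantile(boot_draws, c(0.025, 0.975))
print(c(acme = acme_hat, ci))
```

For linear models without interaction this reproduces the product of coefficients; the point of the algorithm is that the same steps carry over when `lm()` is replaced by other model types.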


Example: Binary Mediator and Outcome

Two logistic regression models:

$$\Pr(M_i = 1 \mid T_i, X_i) = \mathrm{logit}^{-1}(\alpha_2 + \beta_2 T_i + \xi_2^\top X_i)$$
$$\Pr(Y_i = 1 \mid T_i, M_i, X_i) = \mathrm{logit}^{-1}(\alpha_3 + \beta_3 T_i + \gamma M_i + \xi_3^\top X_i)$$

- Can't multiply $\beta_2$ by $\gamma$
- Difference of coefficients $\beta_1 - \beta_3$ doesn't work either:

$$\Pr(Y_i = 1 \mid T_i, X_i) = \mathrm{logit}^{-1}(\alpha_1 + \beta_1 T_i + \xi_1^\top X_i)$$

- Can use our algorithm (example: $E\{Y_i(1, M_i(0))\}$):
  1. Predict $M_i(0)$ given $T_i = 0$ using the first model
  2. Compute $\Pr(Y_i(1, M_i(0)) = 1 \mid T_i = 1, M_i = \widehat{M}_i(0), X_i)$ using the second model
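
A minimal sketch of the two steps for $E\{Y_i(1, M_i(0))\}$ with two logit models, on simulated data. The data-generating values and the helper `predict_prob` are illustrative assumptions. Instead of simulating draws of the binary mediator, this version averages the outcome probability over the predicted mediator distribution, which is one way to carry out the computation, not necessarily the one used in the webinar's software.

```r
# Simulated data: all numbers and names here are illustrative assumptions.
set.seed(7)
n <- 2000
dat <- data.frame(x = rnorm(n), treat = rbinom(n, 1, 0.5))
dat$med <- rbinom(n, 1, plogis(-0.5 + 1.0 * dat$treat + 0.5 * dat$x))
dat$y <- rbinom(n, 1, plogis(-1.0 + 0.5 * dat$treat + 1.2 * dat$med + 0.5 * dat$x))

fit_m <- glm(med ~ treat + x, data = dat, family = binomial)        # first model
fit_y <- glm(y ~ treat + med + x, data = dat, family = binomial)    # second model

# Hypothetical helper: predicted probability after overwriting some columns
predict_prob <- function(fit, data, ...) {
  changes <- list(...)
  for (nm in names(changes)) data[[nm]] <- changes[[nm]]
  predict(fit, newdata = data, type = "response")
}

# Step 1: Pr(M_i(0) = 1 | X_i) and Pr(M_i(1) = 1 | X_i) from the first model
p_m0 <- predict_prob(fit_m, dat, treat = 0)
p_m1 <- predict_prob(fit_m, dat, treat = 1)

# Step 2: Pr(Y_i = 1 | T_i = 1, M_i = m, X_i) from the second model
p_y_med1 <- predict_prob(fit_y, dat, treat = 1, med = 1)
p_y_med0 <- predict_prob(fit_y, dat, treat = 1, med = 0)

# Average over the predicted mediator distribution under each treatment value
ey_1_m0 <- mean(p_y_med1 * p_m0 + p_y_med0 * (1 - p_m0))   # E{Y(1, M(0))}
ey_1_m1 <- mean(p_y_med1 * p_m1 + p_y_med0 * (1 - p_m1))   # E{Y(1, M(1))}
acme_1 <- ey_1_m1 - ey_1_m0                                # estimate of delta(1)
print(c(ey_1_m0 = ey_1_m0, ey_1_m1 = ey_1_m1, acme_1 = acme_1))
```

Unlike the linear case, the result is on the probability scale and cannot be recovered by multiplying or differencing logit coefficients.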


Sensitivity Analysis

- Standard experiments require sequential ignorability to identify mechanisms
- The sequential ignorability assumption is often too strong
- Need to assess the robustness of findings via sensitivity analysis
- Question: How large a departure from the key assumption must occur for the conclusions to no longer hold?
- Parametric sensitivity analysis by assuming

$$\{Y_i(t', m), M_i(t)\} \perp\!\!\!\perp T_i \mid X_i = x$$

  but not

$$Y_i(t', m) \perp\!\!\!\perp M_i(t) \mid T_i = t, X_i = x$$

- Possible existence of unobserved pre-treatment confounder

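Parametric Sensitivity Analysis

- Sensitivity parameter: $\rho \equiv \mathrm{Corr}(\epsilon_{i2}, \epsilon_{i3})$
- Sequential ignorability implies $\rho = 0$
- Set $\rho$ to different values and see how ACME changes
- Result:

$$\bar{\delta}(0) = \bar{\delta}(1) = \frac{\beta_2 \sigma_1}{\sigma_2} \left\{ \tilde{\rho} - \rho \sqrt{(1 - \tilde{\rho}^2)/(1 - \rho^2)} \right\},$$

  where $\sigma_j^2 \equiv \mathrm{var}(\epsilon_{ij})$ for $j = 1, 2$ and $\tilde{\rho} \equiv \mathrm{Corr}(\epsilon_{i1}, \epsilon_{i2})$.
- When do my results go away completely? $\bar{\delta}(t) = 0$ if and only if $\rho = \tilde{\rho}$
- Easy to estimate from the regression of $Y_i$ on $T_i$:

$$Y_i = \alpha_1 + \beta_1 T_i + \epsilon_{i1}$$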
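
The result can be turned into a small function of $\rho$. A minimal sketch on simulated data without covariates; the coefficient values and the function name `acme_given_rho` are illustrative assumptions. Residual standard deviations and the residual correlation from the two regressions on $T_i$ are plugged in for $\sigma_1$, $\sigma_2$ and $\tilde{\rho}$.

```r
# Simulated data: all numbers and names here are illustrative assumptions.
set.seed(11)
n <- 1000
treat <- rbinom(n, 1, 0.5)
med <- 0.5 * treat + rnorm(n)
y <- 0.4 * treat + 0.8 * med + rnorm(n)

fit_total <- lm(y ~ treat)          # Y_i = alpha_1 + beta_1 T_i + eps_i1
fit_med   <- lm(med ~ treat)        # M_i = alpha_2 + beta_2 T_i + eps_i2
fit_out   <- lm(y ~ treat + med)    # outcome model, used only for comparison

beta2_hat <- unname(coef(fit_med)["treat"])
sigma1_hat <- sd(resid(fit_total))
sigma2_hat <- sd(resid(fit_med))
rho_tilde <- cor(resid(fit_total), resid(fit_med))

# ACME implied by a hypothesized error correlation rho (formula from the slide)
acme_given_rho <- function(rho) {
  (beta2_hat * sigma1_hat / sigma2_hat) *
    (rho_tilde - rho * sqrt((1 - rho_tilde^2) / (1 - rho^2)))
}

rho_grid <- seq(-0.9, 0.9, by = 0.1)
sens <- data.frame(rho = rho_grid, acme = acme_given_rho(rho_grid))
print(round(sens, 3))
```

At $\rho = 0$ the function returns the product of coefficients from the fitted models, and at $\rho = \tilde{\rho}$ it returns zero, matching the two statements on the slide.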


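Interpreting Sensitivity Analysis with R squares

- Interpreting $\rho$: how small is too small?
- An unobserved (pre-treatment) confounder formulation:

$$\epsilon_{i2} = \lambda_2 U_i + \epsilon'_{i2} \quad \text{and} \quad \epsilon_{i3} = \lambda_3 U_i + \epsilon'_{i3}$$

- How much does $U_i$ have to explain for our results to go away?
- Sensitivity parameters: R squares
  1. Proportion of previously unexplained variance explained by $U_i$

$$R_M^{2*} \equiv 1 - \frac{\mathrm{var}(\epsilon'_{i2})}{\mathrm{var}(\epsilon_{i2})} \quad \text{and} \quad R_Y^{2*} \equiv 1 - \frac{\mathrm{var}(\epsilon'_{i3})}{\mathrm{var}(\epsilon_{i3})}$$

  2. Proportion of original variance explained by $U_i$

$$\tilde{R}_M^2 \equiv \frac{\mathrm{var}(\epsilon_{i2}) - \mathrm{var}(\epsilon'_{i2})}{\mathrm{var}(M_i)} \quad \text{and} \quad \tilde{R}_Y^2 \equiv \frac{\mathrm{var}(\epsilon_{i3}) - \mathrm{var}(\epsilon'_{i3})}{\mathrm{var}(Y_i)}$$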

Then reparameterize $\rho$ using $(R_M^{2*}, R_Y^{2*})$ (or $(\tilde{R}_M^2, \tilde{R}_Y^2)$):

$$\rho = \mathrm{sgn}(\lambda_2 \lambda_3) R_M^* R_Y^* = \frac{\mathrm{sgn}(\lambda_2 \lambda_3) \tilde{R}_M \tilde{R}_Y}{\sqrt{(1 - R_M^2)(1 - R_Y^2)}}$$

where $R_M^2$ and $R_Y^2$ are from the original mediator and outcome models

- $\mathrm{sgn}(\lambda_2 \lambda_3)$ indicates the direction of the effects of $U_i$ on $Y_i$ and $M_i$
- Set $(R_M^{2*}, R_Y^{2*})$ (or $(\tilde{R}_M^2, \tilde{R}_Y^2)$) to different values and see how mediation effects change

Reanalysis of Brader et al.: Estimates under SI

- Original method: Product of coefficients with the Sobel test. Valid only when both models are linear w/o T-M interaction (which they are not)
- Our method: Calculate ACME using our general algorithm

Outcome variables ($\bar{\delta}(1)$)          Product of Coefficients   Average Causal Mediation Effect ($\delta$)
Decrease Immigration                    .347 [0.146, 0.548]       .105 [0.048, 0.170]
Support English Only Laws               .204 [0.069, 0.339]       .074 [0.027, 0.132]
Request Anti-Immigration Information    .277 [0.084, 0.469]       .029 [0.007, 0.063]
Send Anti-Immigration Message           .276 [0.102, 0.450]       .086 [0.035, 0.144]

Reanalysis: Sensitivity Analysis w.r.t. $\rho$

[Figure: average mediation effect $\bar{\delta}(1)$ (vertical axis, -0.4 to 0.4) plotted against the sensitivity parameter $\rho$ (horizontal axis, -1.0 to 1.0)]

ACME > 0 as long as the error correlation is less than 0.39 (0.30 with 95% CI)

Reanalysis: Sensitivity Analysis w.r.t. $\tilde{R}_M^2$ and $\tilde{R}_Y^2$

[Figure: contour plot of the ACME. Horizontal axis: proportion of total variance in M explained by confounder. Vertical axis: proportion of total variance in Y explained by confounder]

An unobserved confounder can account for up to 26.5% of the variation in both $Y_i$ and $M_i$ before ACME becomes zero

FAQs

- What does it mean when the mediation effect has a different sign from the total effect?
- I don't understand the difference between $\delta_i(0)$ and $\delta_i(1)$.
- Do I always have to measure the mediator before the outcome?
- My treatment is continuous. How do I choose values of $t$ and $t'$?

Q. I got an ACME that had the opposite sign of the total effect. What does that mean?

A. Recall the identity: Total Effect = ACME + Direct Effect. Therefore, ACME and direct effects must have opposite signs and the direct effect is larger in magnitude.

Example: T = job training, Y = earnings, M = skills
- Suppose: Total effect < 0 and ACME > 0
- It must be the case: Direct effect << 0
- That is, there must be some other mechanism (e.g. time spent without searching for jobs) which is more important (quantitatively) than improved skills and makes the net impact of job training on earnings negative.

Q. I don't understand the difference between $\delta_i(0)$ and $\delta_i(1)$. When is one more important than the other?

One can relax the so-called no-interaction rule with the following model for the outcome:

$$Y_i = \alpha_3 + \beta_3 T_i + \gamma M_i + \kappa T_i M_i + \xi_3^\top X_i + \epsilon_{i3}.$$

The average causal mediation effects are given by

$$\bar{\delta}(t) = \beta_2 (\gamma + \kappa t),$$

for $t = 0, 1$.
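
A short sketch of where this expression comes from, assuming the linear mediator model $M_i = \alpha_2 + \beta_2 T_i + \xi_2^\top X_i + \epsilon_{i2}$ from the earlier slides and reading both equations as models for the potential mediators and outcomes:

```latex
\begin{align*}
Y_i(t, M_i(1)) - Y_i(t, M_i(0))
  &= (\gamma + \kappa t)\,\{M_i(1) - M_i(0)\} \\
  &= (\gamma + \kappa t)\,\beta_2, \\
\bar{\delta}(t) &= \beta_2(\gamma + \kappa t), \qquad
\bar{\delta}(1) - \bar{\delta}(0) = \beta_2 \kappa .
\end{align*}
```

All terms that do not involve the mediator cancel in the first line, so the two ACMEs differ only through the interaction coefficient $\kappa$, and they coincide when $\kappa = 0$.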


Q. I don't understand the difference between $\delta_i(0)$ and $\delta_i(1)$. When is one more important than the other?

A. The difference is which condition is considered actual and which is counterfactual.

- $\delta_i(0)$: The effect that the treatment would have had if its only action were to cause the mediator. (Actual world = control)
- $\delta_i(1)$: The effect of treatment that would be prevented if the exposure did not cause the mediator. (Actual world = treated)
- Oftentimes the control condition represents the "natural" state of the world or a "status quo." In this case $\delta_i(0)$ may be the more relevant quantity. Epidemiologists sometimes call $\delta_i(0)$ the pure indirect effect for this reason.

Q. Do I always have to measure the mediator before the outcome? A. Yes, unless you have a really good reason to believe that measuring the outcome has no effect (or only has a negligibly small effect) on the measurement of the mediator. Even if the mediator cannot be affected by the outcome conceptually, the measurement error in the mediator (which is unavoidable in most cases) can be affected by the outcome, contaminating the estimates. This is a measurement error problem much broader than mediation analysis (see Imai and Yamamoto 2010 AJPS).


Q. My treatment is continuous. How do I choose values of $t$ and $t'$?

A. There are several sensible ways to approach this problem:

1. If there are two values that are substantively interesting (e.g. correspond to the two most typical values in the real world), use them.
2. If the empirical distribution of the treatment is bimodal, use two values that represent the two modes.
3. If there is one value that can be regarded as a "baseline" (e.g. no treatment, natural condition), use that value as $t'$, compute multiple ACMEs by setting $t$ to many different values, and plot the estimates against $t$.
4. If there is a natural "cutpoint" in the treatment values, dichotomize the treatment variable before the estimation and treat it as a binary variable (i.e. high vs. low).

Open-Source Software “Mediation”


Implementation Examples

1. Fit models for the mediator and outcome variable and store these models

   > m <- lm(Mediator ~ Treat + X)
   > y <- lm(Y ~ Treat + Mediator + X)

2. Mediation analysis: Feed model objects into the mediate() function. Call a summary of results

   > m.out <- mediate(m, y, treat = "Treat", mediator = "Mediator")
   > summary(m.out)

3. Sensitivity analysis: Feed the output into the medsens() function. Summarize and plot

   > s.out <- medsens(m.out)
   > summary(s.out)
   > plot(s.out, "rho")
   > plot(s.out, "R2")
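
The three steps can be run end to end on the `framing` data set that ships with the mediation package. This is a sketch assuming the package is installed; the choice of mediator (`emo`), outcome (`cong_mesg`), covariates, probit link, and the small `sims = 100` are illustrative choices for a quick run, not the settings behind the results reported in this webinar.

```r
# Runs only if the mediation package is available (an assumption of this sketch).
if (requireNamespace("mediation", quietly = TRUE)) {
  library(mediation)
  data("framing", package = "mediation")
  set.seed(2014)

  # 1. Mediator model (continuous) and outcome model (binary, probit)
  med_fit <- lm(emo ~ treat + age + educ + gender + income, data = framing)
  out_fit <- glm(cong_mesg ~ emo + treat + age + educ + gender + income,
                 data = framing, family = binomial("probit"))

  # 2. Mediation analysis; sims is kept small so the example runs quickly
  med_out <- mediate(med_fit, out_fit, treat = "treat", mediator = "emo",
                     sims = 100)
  print(summary(med_out))

  # 3. Sensitivity analysis with respect to rho and the R squares
  sens_out <- medsens(med_out, rho.by = 0.1, effect.type = "indirect",
                      sims = 100)
  print(summary(sens_out))
  plot(sens_out, sens.par = "rho")
  plot(sens_out, sens.par = "R2", r.type = "total", sign.prod = "positive")
} else {
  message("Package 'mediation' is not installed; skipping the example.")
}
```

For a continuous treatment, `mediate()` also accepts `control.value` and `treat.value` to set the two treatment levels being compared.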


Data Types Available via mediation

                             Outcome Model Types
Mediator Model Types         Linear  GLM  Ordered  Censored  Quantile  GAM  Survival
Linear (lm/lmer)             ✓       ✓    ✓*       ✓         ✓         ✓*   ✓
GLM (glm/bayesglm/glmer)     ✓       ✓    ✓*       ✓         ✓         ✓*   ✓
Ordered (polr/bayespolr)     ✓       ✓    ✓*       ✓         ✓         ✓*   ✓
Censored (tobit via vglm)    -       -    -        -         -         -    -
Quantile (rq)                ✓*      ✓*   ✓*       ✓*        ✓*        ✓*   ✓
GAM (gam)                    ✓*      ✓*   ✓*       ✓*        ✓*        ✓*   ✓*
Survival (survreg)           ✓       ✓    ✓*       ✓         ✓         ✓*   ✓

Types of Models That Can be Handled by mediate. Stars (*) indicate the model combinations that can only be estimated using the nonparametric bootstrap (i.e. with boot = TRUE).

Additional Features

- Treatment/mediator interactions, with formal statistical tests
- Treatment/mediator/pre-treatment interactions and reporting of quantities by pre-treatment values
- Factorial, continuous treatment variables
- Cluster standard errors/adjustable CI reporting/p-values
- Support for multiple imputation
- Multiple mediators
- Multilevel mediation
- A tutorial with examples: Tingley et al. (2014).

Beyond Sequential Ignorability

- Without sequential ignorability, standard experimental design lacks identification power
- Even the sign of ACME is not identified
- Need to develop alternative experimental designs for more credible inference
- Possible when the mediator can be directly or indirectly manipulated
- All proposed designs preserve the ability to estimate the ACME under the SI assumption
- Trade-off: statistical power
- These experimental designs can then be extended to natural experiments in observational studies

Parallel Design

- Must assume no direct effect of manipulation on outcome
- More informative than standard single experiment
- If we assume no T-M interaction, ACME is point identified

Why Do We Need No-Interaction Assumption?

Numerical Example:

Prop.   M_i(1)   M_i(0)   Y_i(t, 1)   Y_i(t, 0)   δ_i(t)
0.3     1        0        0           1           -1
0.3     0        0        1           0            0
0.1     0        1        0           1            1
0.3     1        1        1           0            0

$E(M_i(1) - M_i(0)) = E(Y_i(t, 1) - Y_i(t, 0)) = 0.2$, but $\bar{\delta}(t) = -0.2$

The Problem: Causal effect heterogeneity
- T increases M only on average
- M increases Y only on average
- T-M interaction: Many of those who have a positive effect of T on M have a negative effect of M on Y (first row)

A solution: sensitivity analysis (see Imai and Yamamoto, 2013)

Pitfall of "mechanism experiments" or "causal chain approach"
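
The three averages in the numerical example can be recomputed directly from the table. A minimal sketch; the data frame layout and the helper `y_at` are illustrative choices, while the numbers are those of the table.

```r
# The four types of units from the table, with their population proportions
types <- data.frame(
  prop = c(0.3, 0.3, 0.1, 0.3),
  m1   = c(1, 0, 0, 1),    # M_i(1)
  m0   = c(0, 0, 1, 1),    # M_i(0)
  y_m1 = c(0, 1, 0, 1),    # Y_i(t, 1)
  y_m0 = c(1, 0, 1, 0)     # Y_i(t, 0)
)

# Hypothetical helper: look up Y_i(t, m) for each type at mediator value m
y_at <- function(m) ifelse(m == 1, types$y_m1, types$y_m0)

# Unit-level mediation effect: Y_i(t, M_i(1)) - Y_i(t, M_i(0))
types$delta <- y_at(types$m1) - y_at(types$m0)

effect_t_on_m <- sum(types$prop * (types$m1 - types$m0))      # E(M(1) - M(0))
effect_m_on_y <- sum(types$prop * (types$y_m1 - types$y_m0))  # E(Y(t,1) - Y(t,0))
acme <- sum(types$prop * types$delta)                         # average of delta_i(t)

print(c(effect_t_on_m = effect_t_on_m,
        effect_m_on_y = effect_m_on_y,
        acme = acme))
```

Both average effects come out at 0.2 while the ACME is -0.2, so multiplying the two average effects (0.04) would get even the sign wrong.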


Encouragement Design

- Direct manipulation of mediator is difficult in most situations
- Use an instrumental variable approach
- Advantage: allows for unobserved confounder between M and Y
- Key Assumptions:
  1. Z is randomized or as-if random
  2. No direct effect of Z on Y (a.k.a. exclusion restriction)

Crossover Design

- Recall ACME can be identified if we observe $Y_i(t', M_i(t))$
- Get $M_i(t)$, then switch $T_i$ to $t'$ while holding $M_i = M_i(t)$
- Crossover design:
  1. Round 1: Conduct a standard experiment
  2. Round 2: Change the treatment to the opposite status but fix the mediator to the value observed in the first round
- Very powerful: identifies mediation effects for each subject
- Must assume no carryover effect: Round 1 must not affect Round 2
- Can be made plausible by design

Example: Labor Market Discrimination

Example: Bertrand & Mullainathan (2004, AER)
- Treatment: Black vs. White names on CVs
- Mediator: Perceived qualifications of applicants
- Outcome: Callback from employers

- Quantity of interest: Direct effects of (perceived) race
- Would Jamal get a callback if his name were Greg but his qualifications stayed the same?
- Round 1: Send Jamal's actual CV and record the outcome
- Round 2: Send his CV as Greg and record the outcome
- Assumption: their different names do not change the perceived qualifications of applicants
- Under this assumption, the direct effect can be interpreted as blunt racial discrimination

Designing Observational Studies

- Key difference between experimental and observational studies: treatment assignment
- Sequential ignorability:
  1. Ignorability of treatment given covariates
  2. Ignorability of mediator given treatment and covariates
- Both (1) and (2) are suspect in observational studies
- Statistical control: matching, propensity scores, etc.
- Search for quasi-randomized treatments: "natural" experiments
- How can we design observational studies?
- Experiments can serve as templates for observational studies

Multiple Mediators

[Figure: two causal diagrams (left and right panels) relating T, M, W, and Y]

- Quantity of interest = The average indirect effect with respect to M
- W represents the alternative observed mediators
- Left: Assumes independence between the two mechanisms
- Right: Allows M to be affected by the other mediators W
- Applied work often assumes the independence of mechanisms
- Under this independence assumption, one can apply the same analysis as in the single mediator case
- For causally dependent mediators, we must deal with the heterogeneity in the T × M interaction as done under the parallel design $\Longrightarrow$ sensitivity analysis

Concluding Remarks

- Even in a randomized experiment, a strong assumption is needed to identify causal mechanisms
- However, progress can be made toward this fundamental goal of scientific research with modern statistical tools
- A general, flexible estimation method is available once we assume sequential ignorability
- Sequential ignorability can be probed via sensitivity analysis
- More credible inferences are possible using clever experimental designs
- Insights from new experimental designs can be directly applied when designing observational studies
- Multiple mediators require additional care when they are causally dependent

Thank you!

Collaborators on the project:
- Kosuke Imai (Princeton)
- Luke Keele (Georgetown)
- Dustin Tingley (Harvard)
- Kentaro Hirose (Waseda)

Send your questions and comments to: [email protected]

References

General:
- Unpacking the Black Box of Causality: Learning about Causal Mechanisms from Experimental and Observational Studies. American Political Science Review
- Identifying Mechanisms behind Policy Interventions via Causal Mediation Analysis. Journal of Policy Analysis and Management

Theory:
- Identification, Inference, and Sensitivity Analysis for Causal Mediation Effects. Statistical Science

Extensions:
- A General Approach to Causal Mediation Analysis. Psychological Methods
- Experimental Designs for Identifying Causal Mechanisms. Journal of the Royal Statistical Society, Series A (with discussions)
- Identification and Sensitivity Analysis for Multiple Causal Mechanisms: Revisiting Evidence from Framing Experiments. Political Analysis

Software:
- mediation: R Package for Causal Mediation Analysis. Journal of Statistical Software