Causal Inference for Policy Evaluation

Susan Athey

**Concept Graph & Resume using Claude 3.5 Sonnet | ChatGPT-4o | Llama 3:**

```mermaid
graph LR
classDef main fill:#f9d4d4, font-weight:bold, font-size:14px
classDef intro fill:#d4f9d4, font-weight:bold, font-size:14px
classDef causal fill:#d4d4f9, font-weight:bold, font-size:14px
classDef ml fill:#f9f9d4, font-weight:bold, font-size:14px
classDef methods fill:#f9d4f9, font-weight:bold, font-size:14px
classDef future fill:#d4f9f9, font-weight:bold, font-size:14px
Main["Causal Inference for Policy Evaluation"]
Main --> A["Introduction to Causal Inference"]
A --> A1["Athey: economics, technology, machine learning researcher 1"]
A --> A2["Causal inference estimates effects, answers questions 2"]
A --> A3["Causal questions: policy, advertising, auction impacts 3"]
A --> A4["Train-test paradigm fails for causal inference 4"]
A --> A5["Predicting treatment effects, not just outcomes 5"]
Main --> B["Causal Inference Concepts"]
B --> B1["Correlation ≠ causality, confounding example explained 6"]
B --> B2["Randomized experiments estimate causal link effects 7"]
B --> B3["Empiricists find random treatment assignment variation 8"]
B --> B4["Potential outcomes define unit-level causal effects 9"]
B --> B5["Fundamental problem: unobserved potential outcomes 10"]
B --> B6["Randomization enables unbiased effect estimation 11"]
Main --> C["Machine Learning in Causal Inference"]
C --> C1["ML reduces variance in effect estimation 13"]
C --> C2["Causal trees partition by effect heterogeneity 14"]
C --> C3["Honest estimation: split data for tree/effects 15"]
C --> C4["Honest trees enable valid statistical inference 16"]
C --> C5["Causal tree cross-validation rewards effect heterogeneity 17"]
C --> C6["Causal forests average trees for personalized predictions 20"]
Main --> D["Methods and Applications"]
D --> D1["Approaches: natural experiments, revealed preferences 12"]
D --> D2["Search experiment: query type affects CTR 18"]
D --> D3["Honest estimation reduces variance across leaves 19"]
D --> D4["Causal forests improve on k-nearest neighbors 21"]
D --> D5["Proving statistical properties crucial for causality 22"]
D --> D6["ML methods adapted for causal inference 23"]
Main --> E["Future Directions and Challenges"]
E --> E1["Economics and ML benefit from collaboration 24"]
E --> E2["Collaboration areas: networks, long-term effects 25"]
E --> E3["Key assumptions: ignorability, exclusion restrictions 26"]
E --> E4["Substantive confounding assumptions most critical 27"]
E --> E5["Online learning maximizes information gain 28"]
E --> E6["Causal estimates enable optimal policy decisions 29"]
class Main main
class A,A1,A2,A3,A4,A5 intro
class B,B1,B2,B3,B4,B5,B6 causal
class C,C1,C2,C3,C4,C5,C6 ml
class D,D1,D2,D3,D4,D5,D6 methods
class E,E1,E2,E3,E4,E5,E6 future
```


**Resume:**

**1.-** Susan Athey is an economics professor at Stanford who researches the intersection of economics, technology, and machine learning.

**2.-** In the social sciences, empirical work is nearly synonymous with causal inference: the goal is to estimate causal effects in order to answer substantive questions.

**3.-** Examples of causal questions include the impacts of policies, advertising campaigns, auctions, and mergers on outcomes, welfare, and profits.

**4.-** The train-test paradigm breaks down for causal inference, since each unit's ground-truth outcome under alternative treatments is never observed.

**5.-** The objective in causal inference is predicting treatment effects, not just outcomes; with no ground truth available, the statistical properties of the estimates are critical.

**6.-** Correlation does not equal causality; e.g., confounding occurs when link position is correlated with both link quality and click-through rates.

**7.-** Randomized experiments allow estimating the causal effect of link position on clicks by assigning position independently of link quality.
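The confounding-then-randomization story in points 6-7 can be sketched with a toy simulation; all probabilities, variable names, and effect sizes below are invented for illustration:

```python
import numpy as np

# Toy model: better links get higher positions AND more clicks, so the
# naive position/CTR comparison overstates the causal effect of position.
rng = np.random.default_rng(4)
n = 200_000
quality = rng.integers(0, 2, n)  # latent link quality (confounder)

# Observational assignment: good links are more likely shown in the top slot.
top_obs = rng.random(n) < np.where(quality == 1, 0.8, 0.2)

def clicks(top):
    # True model: top position adds 0.1 to click prob., quality adds 0.2.
    p = 0.1 + 0.1 * top + 0.2 * quality
    return rng.random(n) < p

y_obs = clicks(top_obs)
naive = y_obs[top_obs].mean() - y_obs[~top_obs].mean()

# Randomized experiment: position assigned independently of quality.
top_rnd = rng.random(n) < 0.5
y_rnd = clicks(top_rnd)
experimental = y_rnd[top_rnd].mean() - y_rnd[~top_rnd].mean()

print(round(naive, 2), round(experimental, 2))
```

The naive contrast mixes the position effect with the quality difference between slots, while the experimental contrast recovers the true position effect of 0.1.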

**8.-** Economic empiricists focus on identifying causal effects by finding sources of random or quasi-random variation in treatment assignment.

**9.-** Potential outcomes notation defines unit-level causal effects as the difference in a unit's outcome if assigned to treatment versus control.

**10.-** The fundamental problem of causal inference is that a unit's potential outcome is only observed under the treatment actually received, never both.

**11.-** Under randomization, the unobserved potential outcomes are independent of treatment assignment, allowing unbiased estimation of average treatment effects.
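Points 9-11 can be illustrated with a minimal potential-outcomes simulation; the data-generating process and the true effect of 2.0 are invented for the sketch:

```python
import numpy as np

# Simulate both potential outcomes (y0, y1) per unit, randomize treatment,
# and estimate the average treatment effect (ATE) by a difference in means.
rng = np.random.default_rng(0)
n = 100_000
y0 = rng.normal(0.0, 1.0, n)   # outcome if assigned to control
y1 = y0 + 2.0                  # outcome if treated: true ATE = 2
w = rng.integers(0, 2, n)      # randomized treatment assignment

# Fundamental problem: only one potential outcome per unit is observed.
y_obs = np.where(w == 1, y1, y0)

# Under randomization, (y0, y1) is independent of w, so the
# difference in means is an unbiased estimate of the ATE.
ate_hat = y_obs[w == 1].mean() - y_obs[w == 0].mean()
print(round(ate_hat, 2))
```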

**12.-** Approaches beyond experiments include finding natural experiments, using revealed preferences to infer agent valuations, and imposing structural assumptions about behavior.

**13.-** Machine learning can help reduce variance in estimating average treatment effects by optimally stratifying randomization based on predicted effect heterogeneity.
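One simple version of this idea is stratified randomization: randomize within strata of a covariate that predicts outcomes, so the arms are balanced on it. A minimal sketch, with an invented covariate and quartile strata:

```python
import numpy as np

# Randomize within quartile strata of a prognostic covariate x; balancing
# the arms on x lowers the variance of the ATE estimate.
rng = np.random.default_rng(1)
n = 10_000
x = rng.normal(size=n)  # covariate assumed predictive of outcomes
strata = np.digitize(x, np.quantile(x, [0.25, 0.5, 0.75]))

w = np.zeros(n, dtype=int)
for s in np.unique(strata):
    idx = rng.permutation(np.flatnonzero(strata == s))
    w[idx[: len(idx) // 2]] = 1  # exactly half treated per stratum

# Covariate means are nearly identical across the two arms.
balance = x[w == 1].mean() - x[w == 0].mean()
print(round(balance, 3))
```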

**14.-** Causal trees modify decision trees to partition units based on heterogeneity in treatment effects rather than outcomes.

**15.-** Honest estimation in causal trees uses half the data for building the tree and half for estimating effects within leaves.

**16.-** Honest trees enable valid confidence intervals and p-values for leaf-level treatment effects with many covariates by avoiding overfitting.

**17.-** Causal tree cross-validation rewards effect heterogeneity and penalizes leaf variance, anticipating effect estimation in held-out data.

**18.-** In search experiments, causal trees found that informational queries had larger effects of link position on CTR than celebrity/image queries.

**19.-** Honest estimation reduced the variance of effects across leaves by 2.5x versus adaptive estimation, though it has less power.

**20.-** Causal forests average many honest causal trees for personalized predictions, enabling asymptotically normal estimates without much loss in performance.

**21.-** Causal forests improve on k-nearest neighbors by only matching on covariates relevant to effect heterogeneity rather than all covariates.
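A causal forest can be caricatured as an average of honest trees grown on bootstrap samples. The sketch below uses depth-1 trees and an invented data-generating process, so it is far simpler than the real method:

```python
import numpy as np

# Mini "causal forest": average the predictions of many honest depth-1
# causal trees, each grown on a bootstrap sample.
rng = np.random.default_rng(3)
n = 4000
x = rng.uniform(-1, 1, n)
w = rng.integers(0, 2, n)
y = np.where(x > 0, 2.0, 0.0) * w + rng.normal(0, 1, n)

def diff_in_means(sel):
    return y[sel][w[sel] == 1].mean() - y[sel][w[sel] == 0].mean()

def honest_tree(idx, x_new):
    # Honest split: first half picks the split, second half estimates.
    half = len(idx) // 2
    tr, est = idx[:half], idx[half:]
    grid = np.linspace(-0.8, 0.8, 17)
    c = max(grid, key=lambda c: (diff_in_means(tr[x[tr] <= c])
                                 - diff_in_means(tr[x[tr] > c])) ** 2)
    left = diff_in_means(est[x[est] <= c])
    right = diff_in_means(est[x[est] > c])
    return left if x_new <= c else right

def forest_tau(x_new, n_trees=50):
    # Average tree predictions over bootstrap draws for a personalized
    # treatment-effect estimate at x_new.
    preds = [honest_tree(rng.choice(n, n, replace=True), x_new)
             for _ in range(n_trees)]
    return float(np.mean(preds))

tau_lo = forest_tau(-0.5)   # true effect here is 0
tau_hi = forest_tau(0.5)    # true effect here is 2
print(round(tau_lo, 1), round(tau_hi, 1))
```

Unlike k-nearest neighbors, the trees only split on covariates that move the treatment effect, so irrelevant covariates would simply never be chosen as split variables.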

**22.-** Unlike in purely predictive tasks, the lack of ground truth makes it necessary to prove statistical properties of causal estimators in order to trust their estimates.

**23.-** Machine learning methods can be adapted for causal inference by modifying objective functions and providing new avenues for valid inference.

**24.-** Economics is benefiting from importing ML methods, while ML can learn from the causal inference literature's focus on applications, identification and inference.

**25.-** Areas for collaboration include causal inference with networks, bridging short-term A/B tests to long-term effects, and contextual bandits.

**26.-** Key assumptions include ignorability (as-if random treatment assignment), exclusion restrictions on instruments, and revealed preference for structural models.

**27.-** Substantive assumptions about confounding are the most critical. Functional-form assumptions were previously common but are now either made explicit in Bayesian models or avoided.

**28.-** Online learning could enable adaptively choosing treatments to maximize information gain, with some existing methods trading off validity and performance.

**29.-** Causal effect estimates enable optimal policy decisions, but efficacy may decay over time, necessitating measurement of factors that could change.

**30.-** Understanding drivers of gaps between correlation and causation by unpacking confounding is a key exercise in empirical economics.

Knowledge Vault built by David Vivancos 2024