Concept Graph & Summary using Claude 3.5 Sonnet | ChatGPT-4o | Llama 3:
Summary:
1.- Susan Athey is an economics professor at Stanford who researches the intersection of economics, technology, and machine learning.
2.- Empirical work in the social sciences is, in statistical terms, largely causal inference: the goal is estimating causal effects to answer substantive questions.
3.- Examples of causal questions include the impacts of policies, advertising campaigns, auctions, and mergers on outcomes such as welfare and profits.
4.- The train-test paradigm breaks for causal inference since the ground truth of each unit's outcome under different treatments is not observed.
5.- Objective in causal inference is predicting treatment effects, not just outcomes. Statistical properties of estimates are critical given no ground truth.
6.- Correlation does not equal causality, e.g. confounding occurs when link position is correlated with both link quality and click-through rates.
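The confounding point above can be made concrete with a small simulation. This is an illustrative sketch, not code from the talk: link quality drives both position and clicks, so the naive top-vs-bottom CTR gap overstates the true causal effect of position, while randomizing position recovers it.

```python
# Hypothetical simulation: link quality confounds the naive
# position -> click-through relationship.
import random

random.seed(0)

def simulate(n=100_000, randomized=False):
    """Return mean CTR at the top position minus mean CTR at the bottom."""
    clicks = {1: [], 0: []}  # 1 = top position, 0 = bottom
    for _ in range(n):
        quality = random.random()          # latent link quality
        if randomized:
            top = random.random() < 0.5    # position assigned at random
        else:
            top = quality > 0.5            # good links get the top slot
        # True causal effect of the top slot on click probability: +0.10
        p_click = 0.1 + 0.4 * quality + (0.10 if top else 0.0)
        clicks[int(top)].append(random.random() < p_click)
    return sum(clicks[1]) / len(clicks[1]) - sum(clicks[0]) / len(clicks[0])

naive = simulate(randomized=False)        # confounded: inflated gap
experimental = simulate(randomized=True)  # close to the true +0.10
```

The naive gap mixes the position effect with the quality difference between slots; the randomized comparison isolates the position effect, which is the logic of points 6 and 7.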
7.- Randomized experiments allow estimating the causal effect of link position on clicks by assigning position independently of link quality.
8.- Economic empiricists focus on identifying causal effects by finding sources of random or quasi-random variation in treatment assignment.
9.- Potential outcomes notation defines unit-level causal effects as the difference in a unit's outcome if assigned to treatment versus control.
10.- The fundamental problem of causal inference is that a unit's potential outcome is only observed under the treatment actually received, never both.
11.- Under randomization, the unobserved potential outcomes are independent of treatment assignment, allowing unbiased estimation of average treatment effects.
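Points 9 through 11 can be sketched in the potential-outcomes notation with synthetic data (illustrative, not Athey's code): each unit has outcomes Y(1) and Y(0), only one is ever observed, and under randomization the difference in arm means is an unbiased estimate of the average treatment effect.

```python
# Potential-outcomes sketch: the fundamental problem and the
# difference-in-means estimator under randomization.
import random

random.seed(1)
n = 200_000
y0 = [random.gauss(0.0, 1.0) for _ in range(n)]  # outcome under control
y1 = [y + 2.0 for y in y0]                       # true unit-level effect = 2.0
w = [random.random() < 0.5 for _ in range(n)]    # random treatment assignment

# Fundamental problem: only one potential outcome per unit is observed.
y_obs = [y1[i] if w[i] else y0[i] for i in range(n)]

treated = [y_obs[i] for i in range(n) if w[i]]
control = [y_obs[i] for i in range(n) if not w[i]]
ate_hat = sum(treated) / len(treated) - sum(control) / len(control)
```

Because assignment is independent of (y0, y1), `ate_hat` concentrates around the true effect of 2.0 even though no unit's individual effect is ever observed.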
12.- Approaches beyond experiments include finding natural experiments, using revealed preferences to infer agent valuations, and imposing structural assumptions about behavior.
13.- Machine learning can help reduce variance in estimating average treatment effects by optimally stratifying randomization based on predicted effect heterogeneity.
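One simple version of the variance-reduction idea in point 13 is pair-matched randomization, a limiting case of stratification. The sketch below is illustrative (the design and names are not from the talk): sorting on a covariate that predicts the outcome and randomizing within adjacent pairs balances the arms, shrinking estimator variance versus complete randomization.

```python
# Compare complete randomization against pair-stratified randomization
# on a covariate that strongly predicts the outcome.
import random
import statistics

random.seed(2)

def ate_hat(x, assign):
    """Simulate outcomes (true effect = 1.0) and return difference in means."""
    y = [5.0 * xi + (1.0 if wi else 0.0) + random.gauss(0, 0.1)
         for xi, wi in zip(x, assign)]
    t = [yi for yi, wi in zip(y, assign) if wi]
    c = [yi for yi, wi in zip(y, assign) if not wi]
    return sum(t) / len(t) - sum(c) / len(c)

def complete_random(x):
    w = [1] * (len(x) // 2) + [0] * (len(x) - len(x) // 2)
    random.shuffle(w)
    return w

def pair_stratified(x):
    # Sort on the covariate, randomize within adjacent pairs.
    w = [0] * len(x)
    order = sorted(range(len(x)), key=lambda i: x[i])
    for k in range(0, len(order), 2):
        i, j = order[k], order[k + 1]
        if random.random() < 0.5:
            w[i] = 1
        else:
            w[j] = 1
    return w

x = [random.random() for _ in range(100)]
est_cr = [ate_hat(x, complete_random(x)) for _ in range(500)]
est_st = [ate_hat(x, pair_stratified(x)) for _ in range(500)]
var_cr = statistics.pvariance(est_cr)
var_st = statistics.pvariance(est_st)  # markedly smaller
```

Both designs are unbiased for the effect of 1.0; the stratified design removes the covariate-imbalance component of the variance.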
14.- Causal trees modify decision trees to partition units based on heterogeneity in treatment effects rather than outcomes.
15.- Honest estimation in causal trees uses half the data for building the tree and half for estimating effects within leaves.
16.- Honest trees enable valid confidence intervals and p-values for leaf-level treatment effects with many covariates by avoiding overfitting.
17.- Causal tree cross-validation rewards effect heterogeneity and penalizes leaf variance, anticipating effect estimation in held-out data.
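The honest-estimation idea in points 15 through 17 can be reduced to its simplest case, a single split (an illustrative sketch, not the Athey-Imbens algorithm): one half of the data chooses the split that maximizes effect heterogeneity, and the held-out half estimates the leaf effects, so those estimates are not overfit to the split search.

```python
# Honest single-split "causal tree": split chosen on a training half,
# leaf treatment effects estimated on an independent half.
import random

random.seed(3)

def leaf_effect(rows):
    """Difference in treated vs. control means within a leaf."""
    t = [y for x, w, y in rows if w]
    c = [y for x, w, y in rows if not w]
    return sum(t) / len(t) - sum(c) / len(c)

def best_split(rows, thresholds):
    # Choose the threshold maximizing heterogeneity between leaf effects.
    def score(th):
        left = [r for r in rows if r[0] <= th]
        right = [r for r in rows if r[0] > th]
        return (leaf_effect(left) - leaf_effect(right)) ** 2
    return max(thresholds, key=score)

def draw(n):
    # Synthetic data: the effect is 2 when x > 0.5, else 0.
    rows = []
    for _ in range(n):
        x = random.random()
        w = random.random() < 0.5
        tau = 2.0 if x > 0.5 else 0.0
        rows.append((x, int(w), tau * w + random.gauss(0, 0.5)))
    return rows

train, estimate = draw(2000), draw(2000)  # honest sample split
th = best_split(train, [0.3, 0.4, 0.5, 0.6, 0.7])
eff_left = leaf_effect([r for r in estimate if r[0] <= th])
eff_right = leaf_effect([r for r in estimate if r[0] > th])
```

Because `estimate` played no role in choosing `th`, the leaf effects behave like ordinary difference-in-means estimates, which is what makes the confidence intervals in point 16 valid.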
18.- In search experiments, causal trees found that informational queries showed larger effects of link position on CTR than celebrity/image queries.
19.- Honest estimation reduced the variance of effects across leaves by 2.5x versus adaptive estimation, though it has less power.
20.- Causal forests average many honest causal trees for personalized predictions, enabling asymptotically normal estimates without much loss in performance.
21.- Causal forests improve on k-nearest neighbors by only matching on covariates relevant to effect heterogeneity rather than all covariates.
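The averaging idea in points 20 and 21 can be caricatured with honest single-split stumps in place of full trees (a toy illustration of the forest principle, not the Athey-Wager grf algorithm): each stump makes an honest split on a random half-sample, and personalized predictions average across stumps.

```python
# Toy "causal forest": average personalized effect predictions from
# many honest stumps, each fit on a shuffled half of the data.
import random

random.seed(4)

def draw(n):
    data = []
    for _ in range(n):
        x = random.random()
        w = random.random() < 0.5
        y = (2.0 if x > 0.5 else 0.0) * w + random.gauss(0, 0.5)
        data.append((x, int(w), y))
    return data

def effect(rows):
    t = [y for _, w, y in rows if w]
    c = [y for _, w, y in rows if not w]
    return sum(t) / len(t) - sum(c) / len(c)

def honest_stump(data):
    random.shuffle(data)
    half = len(data) // 2
    train, est = data[:half], data[half:]
    # Pick the best of a few random split candidates on the training half.
    th = max((random.uniform(0.2, 0.8) for _ in range(10)),
             key=lambda s: (effect([r for r in train if r[0] <= s]) -
                            effect([r for r in train if r[0] > s])) ** 2)
    # Estimate the two leaf effects honestly on the held-out half.
    left = effect([r for r in est if r[0] <= th])
    right = effect([r for r in est if r[0] > th])
    return lambda x: left if x <= th else right

data = draw(4000)
stumps = [honest_stump(list(data)) for _ in range(50)]

def forest_effect(x):
    return sum(s(x) for s in stumps) / len(stumps)

# forest_effect recovers a large effect for x > 0.5 and a small one below.
```

Averaging smooths the hard leaf boundaries of individual stumps, which is one intuition for why forests give better personalized estimates than a single tree.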
22.- Unlike purely predictive tasks, lack of ground truth necessitates proving statistical properties of causal estimators to trust estimates.
23.- Machine learning methods can be adapted for causal inference by modifying objective functions and providing new avenues for valid inference.
24.- Economics is benefiting from importing ML methods, while ML can learn from the causal inference literature's focus on applications, identification and inference.
25.- Areas for collaboration include causal inference with networks, bridging short-term AB tests to long-term effects, and contextual bandits.
26.- Key assumptions include ignorability (as-if random treatment assignment), exclusion restrictions on instruments, and revealed preference for structural models.
27.- Substantive assumptions about confounding are the most critical. Functional-form assumptions, once common, are now either made explicit in Bayesian models or avoided.
28.- Online learning could enable adaptively choosing treatments to maximize information gain, with some existing methods trading off validity and performance.
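Thompson sampling is one standard instance of the adaptive treatment choice mentioned in point 28 (the talk does not prescribe this specific algorithm; the sketch is illustrative): Beta posteriors over two arms' success rates steer assignment toward the better-performing arm as evidence accumulates.

```python
# Thompson-sampling sketch for two binary-outcome treatments.
import random

random.seed(5)
true_rates = [0.30, 0.50]  # unknown to the learner
wins = [1, 1]              # Beta(1, 1) priors
losses = [1, 1]
pulls = [0, 0]

for _ in range(5000):
    # Sample a success rate from each arm's posterior, play the argmax.
    draws = [random.betavariate(wins[a], losses[a]) for a in (0, 1)]
    arm = 0 if draws[0] > draws[1] else 1
    pulls[arm] += 1
    if random.random() < true_rates[arm]:
        wins[arm] += 1
    else:
        losses[arm] += 1
# Pulls concentrate on the better arm; but naive arm-mean comparisons on
# adaptively collected data are biased, which is the validity/performance
# trade-off point 28 refers to.
```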
29.- Causal effect estimates enable optimal policy decisions, but efficacy may decay over time, necessitating measurement of factors that could change.
30.- Understanding drivers of gaps between correlation and causation by unpacking confounding is a key exercise in empirical economics.
Knowledge Vault built by David Vivancos 2024