Primary Submission Category: Heterogeneous Treatment Effects
A systematic comparison of machine learning methods for estimating heterogeneous treatment effects in large-scale randomized trials
Authors: Pei Zhu, Luke Miratrix, Richard Dorsett, David Selby, Polina Polskaia, Nicholas Commins
Presenting Author: David Selby*
Analysts often seek to understand how treatment effects vary across individuals, and machine learning offers a flexible framework for exploring this heterogeneity. Despite the variety of proposed methods, there is limited guidance on which to use in experimental evaluations. Our study investigates the robustness of these techniques to diverse impact-generating mechanisms, ranging from simple to complex, emphasizing the need for methods that perform well across many situations.
Using an empirical Monte Carlo approach, we analyze two US education trials, generating synthetic datasets whose covariates and untreated outcomes resemble the originals and then applying known treatment effects to produce outcomes for the treatment group. We evaluate a range of methods in repeated train-test splits, assessing their ability to capture impact heterogeneity under various impact-generating processes.
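The empirical Monte Carlo design described above can be sketched as follows. This is a minimal illustrative example, not the authors' actual pipeline: the covariates, outcome model, and the simple linear impact-generating process are all hypothetical stand-ins for data resembling the original trials.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical stand-in for a trial dataset: covariates X and an
# untreated (control) outcome y0 resembling the original data.
n, p = 1000, 5
X = rng.normal(size=(n, p))
y0 = X @ rng.normal(size=p) + rng.normal(size=n)

# Impose a known impact-generating process tau(X); here a simple
# linear form, one point on the simple-to-complex spectrum.
tau = 0.5 + 0.3 * X[:, 0]

# Randomize treatment and construct the observed outcomes:
# treated units receive y0 + tau, control units receive y0.
t = rng.binomial(1, 0.5, size=n)
y = y0 + t * tau

# Repeated train-test splits let each estimator be scored against
# the known individual effects tau on held-out data.
X_tr, X_te, y_tr, y_te, t_tr, t_te, tau_tr, tau_te = train_test_split(
    X, y, t, tau, test_size=0.3, random_state=0)
```

Because the true individual effects tau are known by construction, bias and RMSE of any estimator can be computed directly on the test split.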
Our findings reveal that, in the absence of impact variation, all estimators show low bias, with regularized methods being more precise. Under straightforward heterogeneity, most methods outperform the average treatment effect (ATE) as a predictor of individual impacts, with stable Lasso models typically achieving the lowest root mean square error (RMSE). Under complex heterogeneity, tree-based methods and double machine learning approaches yield lower bias. In general, each method best captures heterogeneity that resembles its own design, though the Causal Forest is notably strong across contexts.
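The kind of comparison summarized above can be illustrated with a toy benchmark: a constant ATE prediction versus a Lasso-based T-learner on a simulated trial with simple linear heterogeneity. This is a hedged sketch under assumed data-generating parameters, not a reproduction of the study's estimators or results.

```python
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(1)

# Simulated trial with simple linear impact heterogeneity (illustrative).
n, p = 2000, 5
X = rng.normal(size=(n, p))
tau = 0.5 + 0.3 * X[:, 0]          # true individual effects
t = rng.binomial(1, 0.5, size=n)   # randomized treatment
y = X @ rng.normal(size=p) + t * tau + rng.normal(size=n)

# Baseline: predict the same ATE for everyone.
ate = y[t == 1].mean() - y[t == 0].mean()

# T-learner with Lasso: fit outcome models separately by arm,
# take the difference in predictions as the individual effect.
m1 = LassoCV(cv=5).fit(X[t == 1], y[t == 1])
m0 = LassoCV(cv=5).fit(X[t == 0], y[t == 0])
tau_hat = m1.predict(X) - m0.predict(X)

# Score both against the known effects.
rmse_ate = np.sqrt(np.mean((ate - tau) ** 2))
rmse_lasso = np.sqrt(np.mean((tau_hat - tau) ** 2))
```

With genuine heterogeneity in the simulation, the flexible learner's RMSE falls below the constant-ATE baseline, mirroring the pattern reported for straightforward heterogeneity.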