Primary Submission Category: Machine Learning and Causal Inference

Stan + BART for causal inference: improved performance for heterogeneous effects

Authors: Jennifer Hill, George Perrett, Vincent Dorie, Ben Goodrich

Presenting Author: George Perrett*

A wide range of machine-learning-based approaches to causal inference has been developed over the past decade, increasing our ability to accurately model nonlinear and non-additive response surfaces. These approaches have improved performance on inferential tasks such as estimating average treatment effects in situations where standard parametric models may not fit the data well. They have also shown promise for the related task of identifying heterogeneous treatment effects. However, when data are structured within groups, estimation of both overall and heterogeneous treatment effects can be hampered if we fail to correctly model the dependence between observations, and most machine learning methods do not readily accommodate such structure. This paper introduces a new algorithm, stan4bart, that combines the flexibility of Bayesian Additive Regression Trees (BART) for fitting nonlinear response surfaces with the computational and statistical efficiencies of using Stan for the parametric components of the model. We demonstrate how stan4bart can be used to estimate average, subgroup, and individual-level treatment effects, with stronger performance than both flexible approaches that ignore the multilevel structure of the data and multilevel approaches that impose strict parametric forms.
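To make the three estimands concrete, the following is a minimal sketch of how average, subgroup, and individual-level effects could be summarized from posterior draws of the expected outcome under treatment and under control. It is not the stan4bart interface: the function name `summarize_effects`, its arguments, and the simulated draws are all hypothetical, and the sketch assumes that some fitted model has already produced draws of both potential-outcome means for every observation.

```python
import numpy as np


def summarize_effects(y1_draws, y0_draws, groups):
    """Summarize treatment effects from posterior draws (hypothetical helper).

    y1_draws, y0_draws: arrays of shape (n_draws, n_obs) holding posterior
        draws of the expected outcome under treatment and under control.
    groups: array of length n_obs with a group label per observation.
    """
    # Draw-by-draw individual-level effects, shape (n_draws, n_obs).
    effect_draws = np.asarray(y1_draws) - np.asarray(y0_draws)
    groups = np.asarray(groups)

    # Individual-level effects: posterior mean for each observation.
    individual = effect_draws.mean(axis=0)

    # Average effect: average over observations within each draw,
    # then summarize the resulting posterior distribution.
    average_draws = effect_draws.mean(axis=1)
    average = average_draws.mean()
    average_interval = np.percentile(average_draws, [2.5, 97.5])

    # Subgroup effects: the same averaging restricted to each group.
    subgroup = {
        g.item(): effect_draws[:, groups == g].mean(axis=1).mean()
        for g in np.unique(groups)
    }
    return individual, average, average_interval, subgroup


# Simulated draws standing in for model output (assumption, not real results).
rng = np.random.default_rng(0)
demo_groups = np.repeat([0, 1, 2], 50)
demo_y0 = rng.normal(0.0, 1.0, size=(200, 150))
demo_y1 = demo_y0 + 1.0 + 0.5 * demo_groups  # effect grows with group label
individual, average, interval, subgroup = summarize_effects(
    demo_y1, demo_y0, demo_groups
)
print(round(float(average), 3), {g: round(float(v), 3) for g, v in subgroup.items()})
```

Working draw by draw keeps the uncertainty in each summary: the interval for the average effect comes from the distribution of per-draw averages rather than from the spread of the individual-level point estimates.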