Primary Submission Category: Heterogeneous Treatment Effects

Causal machine learning for heterogeneous treatment effects in the presence of missing outcome data

Authors: Matthew Pryce, Karla Diaz-Ordaz, Stijn Vansteelandt, Ruth Keogh

Presenting Author: Matthew Pryce*

In recent years, there has been growing interest in personalized treatment and policy decisions. However, estimating heterogeneous treatment effects often requires large, rich datasets that contain high-dimensional data or complex correlations between variables. As a result, causal machine learning estimators such as the DR-learner have grown in popularity, offering flexible and efficient tools for exploring heterogeneity.
Additionally, data are often subject to loss to follow-up, leaving some outcomes missing. To handle this, researchers often fit an imputation model or use inverse-probability censoring weights, which reduces the robustness and efficiency of conditional average treatment effect (CATE) estimation. We therefore propose an extension of the DR-learner that handles missing outcome data through an orthogonal implementation of inverse censoring weights. Our robust solution assumes that outcomes are missing at random and uses semiparametric theory to derive a CATE estimator via a two-step process: first, debiasing outcome predictions using the efficient influence function (EIF) of the average treatment effect; second, running a pseudo-outcome regression to obtain estimates of the CATE.
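The two-step process described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the standard augmented inverse-probability-weighted pseudo-outcome, phi = (A - pi(X)) / (pi(X)(1 - pi(X))) * C / G(A, X) * (Y - mu_A(X)) + mu_1(X) - mu_0(X), where A is a binary treatment, C = 1 marks an observed outcome, pi is the propensity score, G is the probability that the outcome is observed, and mu_a is the outcome regression among observed units. The function names, the linear and logistic nuisance models, the clipping bounds, and the two-fold cross-fitting are all illustrative assumptions; in practice flexible machine learning models would replace them.

```python
import numpy as np

def _design(X):
    # Add an intercept column to the covariate matrix.
    return np.column_stack([np.ones(len(X)), X])

def fit_linear(X, y):
    """Ordinary least squares; returns a prediction function."""
    coef, *_ = np.linalg.lstsq(_design(X), y, rcond=None)
    return lambda Z: _design(Z) @ coef

def fit_logistic(X, y, n_iter=25, ridge=1e-6):
    """Logistic regression by Newton's method; returns clipped probabilities."""
    Xd = _design(X)
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(Xd @ beta)))
        hess = Xd.T @ (Xd * (p * (1 - p))[:, None]) + ridge * np.eye(len(beta))
        beta = beta + np.linalg.solve(hess, Xd.T @ (y - p) - ridge * beta)
    # Clip to [0.01, 0.99] to keep the inverse weights bounded (assumption).
    return lambda Z: np.clip(1.0 / (1.0 + np.exp(-(_design(Z) @ beta))), 0.01, 0.99)

def dr_learner_missing(X, A, Y, C, n_folds=2, seed=0):
    """Hypothetical DR-learner sketch for outcomes missing at random.

    X: (n, p) covariates; A: binary treatment; Y: outcome (may be NaN where
    C == 0); C: 1 if the outcome is observed. Returns a CATE prediction function.
    """
    n = len(X)
    folds = np.random.default_rng(seed).permutation(n) % n_folds
    XA = np.column_stack([X, A])
    pseudo = np.empty(n)
    for k in range(n_folds):
        tr, te = folds != k, folds == k
        # Step 1a: nuisance models fitted on the training folds only.
        pi = fit_logistic(X[tr], A[tr])(X[te])       # P(A = 1 | X)
        G = fit_logistic(XA[tr], C[tr])(XA[te])      # P(C = 1 | A, X)
        mu = {}
        for a in (0, 1):
            idx = tr & (A == a) & (C == 1)           # observed outcomes in arm a
            mu[a] = fit_linear(X[idx], Y[idx])(X[te])
        mu_A = np.where(A[te] == 1, mu[1], mu[0])
        # Step 1b: debias predictions; missing outcomes contribute zero residual.
        resid = np.where(C[te] == 1, Y[te] - mu_A, 0.0)
        pseudo[te] = (A[te] - pi) / (pi * (1 - pi)) * resid / G + mu[1] - mu[0]
    # Step 2: regress the pseudo-outcome on covariates to estimate the CATE.
    return fit_linear(X, pseudo)
```

Calling `dr_learner_missing(X, A, Y, C)` returns a function that maps a covariate matrix to estimated CATE values. Cross-fitting is used so that each pseudo-outcome is built from nuisance estimates that did not see that observation.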
We demonstrate the utility of the approach mathematically, by providing excess risk bounds, and empirically, through a simulation study and a data example. We also provide a debiased MSE validation metric for the CATE, along with an extension of the approach that allows adjustment for time-varying drivers of missingness.