Primary Submission Category: Heterogeneous Treatment Effects
Combining Observational and Experimental Data to Learn Interpretable Subgroups with Heterogeneous Treatment Effects
Authors: Rahul Ladhania, Amelia Haviland
Presenting Author: Rahul Ladhania*
In this paper, we propose and evaluate a two-study approach that combines large “noisy” observational data with smaller “clean” experimental data to learn interpretable population subgroups with heterogeneous treatment effects. In Study 1, we use an observational dataset, potentially subject to unobserved confounding, to identify the subgroups that exhibit the most distinctive treatment-outcome relationships. Our method employs a non-parametric approach, validated through a three-stage sample-splitting process to minimize overfitting and ensure robustness. Although Study 1 reduces noise, it remains susceptible to bias from unobserved confounding. Study 2 therefore leverages an experimental design: it applies the subgroup definitions learned in Study 1 to estimate treatment effects within each subgroup, thereby testing the causal hypotheses generated in Study 1. We demonstrate the strengths and limitations of our approach in a simulation setting that varies the degree and direction of unobserved confounding. Additionally, we apply our method to data from the Women’s Health Initiative, a landmark 1991 study investigating the health effects of hormone replacement therapy on postmenopausal women.
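
The two-study workflow described above can be illustrated with a minimal sketch. It is a toy under stated assumptions, not the authors' method: the abstract does not specify the non-parametric learner or what each of the three sample-splitting stages does, so the sketch substitutes a single-threshold subgroup rule on one covariate and reads the three stages as "shortlist, select, estimate". The data-generating process, sample sizes, and all names (`simulate`, `subgroup_gap`, `diff_in_means`) are hypothetical.

```python
import numpy as np

def diff_in_means(y, t):
    """Treated-minus-control difference in mean outcomes."""
    return y[t == 1].mean() - y[t == 0].mean()

def subgroup_gap(x, t, y, c):
    """How far apart the naive effects are in the subgroups x > c and x <= c."""
    hi = x > c
    return abs(diff_in_means(y[hi], t[hi]) - diff_in_means(y[~hi], t[~hi]))

def simulate(n, rng, confounded):
    """Hypothetical data: the true effect is 2 when x > 0 and 0 otherwise."""
    x = rng.uniform(-1.0, 1.0, n)
    u = rng.normal(size=n)  # unobserved confounder, always affects the outcome
    if confounded:
        # Observational study: u also pushes units into treatment.
        t = (rng.normal(size=n) + 0.5 * u > 0).astype(int)
    else:
        # Experimental study: treatment assigned by a fair coin flip.
        t = rng.integers(0, 2, n)
    y = np.where(x > 0, 2.0, 0.0) * t + u + rng.normal(size=n)
    return x, t, y

rng = np.random.default_rng(0)

# Study 1 (observational), split into three folds (assumed roles, see lead-in).
x, t, y = simulate(60_000, rng, confounded=True)
f0, f1, f2 = np.array_split(rng.permutation(len(x)), 3)
candidates = np.linspace(-0.8, 0.8, 17)

# Stage 1: shortlist the three thresholds with the largest subgroup gap.
gaps0 = np.array([subgroup_gap(x[f0], t[f0], y[f0], c) for c in candidates])
shortlist = candidates[np.argsort(gaps0)[-3:]]

# Stage 2: pick among the shortlist on fresh data.
gaps1 = [subgroup_gap(x[f1], t[f1], y[f1], c) for c in shortlist]
threshold = float(shortlist[int(np.argmax(gaps1))])

# Stage 3: estimate the (possibly confounded) associations on held-out data.
hi = x[f2] > threshold
obs_effects = {"high": diff_in_means(y[f2][hi], t[f2][hi]),
               "low": diff_in_means(y[f2][~hi], t[f2][~hi])}

# Study 2 (experimental): reuse the learned subgroup definition unchanged.
xe, te, ye = simulate(4_000, rng, confounded=False)
hi_e = xe > threshold
exp_effects = {"high": diff_in_means(ye[hi_e], te[hi_e]),
               "low": diff_in_means(ye[~hi_e], te[~hi_e])}

print("learned threshold:", threshold)
print("observational estimates:", obs_effects)
print("experimental estimates:", exp_effects)
```

In this toy setup the confounder shifts the observational estimates in both subgroups by a similar amount, so the subgroup boundary can still be recovered even though the effect sizes are biased. The randomized data then provide the within-subgroup estimates. Whether this carries over to other confounding structures is the kind of question the simulations in the paper vary the degree and direction of confounding to examine.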