Primary Submission Category: Heterogeneous Treatment Effects
Constructing influence sets for heterogeneous treatment effect models
Authors: Melody Huang, Ana Kenney, Tiffany Tang
Presenting Author: Melody Huang*
Evaluating the performance of heterogeneous treatment effect (HTE) models (e.g., metalearners) is inherently challenging. Unlike in standard supervised learning problems, the prediction target, the conditional average treatment effect, is unobservable. As a result, existing model evaluation tools (e.g., omnibus goodness-of-fit tests or relative error measures) must rely on indirect diagnostics; however, these approaches can be highly sensitive to certain observations in a study. In this paper, we propose a novel framework for identifying influence sets: subsets of observations whose removal induces the largest change in an HTE model's predictive behavior. Our approach extends classical influence function methodology to account for dependence across observations, enabling exact characterization of the impact of removing multiple points simultaneously. This extension generalizes to complex metalearners that require multi-stage estimation. We show that identifying the most influential subset can be formulated as a mixed-integer quadratic optimization problem, yielding global optimality guarantees for the resulting influence set. Moreover, our approach is general and can be adapted to other metrics of interest beyond shifts in predictive behavior. We illustrate the utility and flexibility of our approach in a case study evaluating the impact of a cash transfer program.
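
To make the notion of an influence set concrete, the following is a minimal illustrative sketch and not the method proposed in the paper: it finds, by exhaustive refitting, the size-k subset whose removal most changes the predictions of a simple HTE model. The T-learner with ordinary least squares, the mean squared shift in predicted effects as the metric, and the function names `fit_ols`, `t_learner_cate`, and `most_influential_subset` are all assumptions chosen for illustration.

```python
import itertools

import numpy as np


def fit_ols(X, y):
    """Return least-squares coefficients for y ~ X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def t_learner_cate(X, y, t, X_eval):
    """T-learner (assumed for illustration): fit one OLS model per arm,
    then predict the treatment effect as the difference of the two."""
    beta_treated = fit_ols(X[t == 1], y[t == 1])
    beta_control = fit_ols(X[t == 0], y[t == 0])
    return X_eval @ beta_treated - X_eval @ beta_control


def most_influential_subset(X, y, t, k):
    """Brute-force search over all size-k subsets of observations.

    Returns the subset whose removal gives the largest mean squared
    change in predicted effects, evaluated on the full sample.
    The cost grows combinatorially in k, so this is only feasible
    for very small problems.
    """
    n = len(y)
    baseline = t_learner_cate(X, y, t, X)
    best_subset, best_shift = None, -np.inf
    for subset in itertools.combinations(range(n), k):
        keep = np.ones(n, dtype=bool)
        keep[list(subset)] = False
        # Exact refit without the subset (no linear approximation).
        refit = t_learner_cate(X[keep], y[keep], t[keep], X)
        shift = float(np.mean((refit - baseline) ** 2))
        if shift > best_shift:
            best_subset, best_shift = subset, shift
    return best_subset, best_shift
```

Calling `most_influential_subset(X, y, t, k=1)` on a small data set with one grossly contaminated outcome would be expected to flag that observation. Because the number of candidate subsets grows combinatorially with k, exhaustive search of this kind does not scale, which is the motivation for casting the search as an optimization problem as described in the abstract.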
