Primary Submission Category: Machine Learning and Causal Inference
Few-shot causal learning for new treatments and outcomes using task embeddings
Authors: Sophie Woodward, James Kitch, Claudio Battiloro, Mauricio Tec, Francesca Dominici
Presenting Author: Sophie Woodward*
Estimating heterogeneous treatment effects is a central problem in causal inference, with applications in personalized medicine, public policy, and online marketing. Existing methods focus on predicting the effects of fixed treatments on fixed outcomes and do not address settings in which new treatments or new outcomes are introduced. In many such settings, a small amount of data from the new treatment-outcome pair may be available, for example from an early-phase clinical trial. We study the problem of estimating the conditional average treatment effect (CATE) for a new treatment-outcome pair from limited data by borrowing information from previously observed treatments and outcomes. Specifically, we view CATE estimation for each treatment-outcome pair as a task, and propose a framework that uses task embeddings (vector representations that encode structural or semantic relationships across treatments and outcomes) to predict the CATE function across tasks. We then estimate the CATE for the new task by combining the embedding-based CATE predictor learned across tasks with a CATE estimator fit using data from the new task alone. This yields a data-fusion estimator that, under some regularity conditions, can reduce variance relative to estimation from the new task's data alone. Experiments on semi-synthetic benchmarks and large-scale medical claims data evaluate performance and illustrate the roles of covariate shift, number of tasks, task sample size, and embedding distance.
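
The combination step can be illustrated with a minimal sketch. The abstract does not specify how the two estimators are combined, so the convex combination and the inverse-variance weight below are assumptions chosen for illustration, and the function names `fuse_cate` and `inverse_variance_weight` are hypothetical. The weight formula further assumes that both estimates are unbiased and independent, which need not hold in practice.

```python
import numpy as np


def fuse_cate(tau_embed, tau_task, weight):
    """Convex combination of two CATE estimates evaluated at the same covariate points.

    tau_embed: predictions from an embedding-based predictor learned across tasks
    tau_task:  estimates fit using data from the new task alone
    weight:    weight in [0, 1] placed on the embedding-based predictions
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    tau_embed = np.asarray(tau_embed, dtype=float)
    tau_task = np.asarray(tau_task, dtype=float)
    return weight * tau_embed + (1.0 - weight) * tau_task


def inverse_variance_weight(var_embed, var_task):
    """Weight on the embedding-based estimate that minimizes the variance of the
    combination, ASSUMING both estimates are unbiased and independent."""
    return var_task / (var_embed + var_task)


# Illustrative numbers only (not results from the paper).
w = inverse_variance_weight(var_embed=1.0, var_task=3.0)
fused = fuse_cate([1.0, 2.0], [3.0, 6.0], w)
fused_variance = w**2 * 1.0 + (1.0 - w) ** 2 * 3.0
```

With these illustrative variances the weight on the embedding-based prediction is 0.75 and the variance of the combination is 0.75, below both input variances, which is the sense in which a fused estimator of this kind can reduce variance relative to using the new task's data alone.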
