Primary Submission Category: Machine Learning and Causal Inference
Surrogate Augmentation for Causal Inference on Censored Survival Outcomes
Authors: Yaroslav Mukhin, Tereza Oprea, Arielle Anderer, Christina Yu, Jelena Bradic
Presenting Author: Yaroslav Mukhin*
Missing data on a long-horizon outcome is a common challenge in evaluating the impact of an intervention: experiments are constrained by budgets and timelines, and observational studies face drop-out. We show how to leverage surrogate outcomes to increase power for causal effect estimation with a right-censored survival outcome. Censoring causes a loss of information, but it also creates an opportunity to regain statistical efficiency by employing auxiliary variables. For example, in a clinical trial evaluating the effect of a cancer treatment on survival, disease progression, or its absence, is informative about a censored survival time. With missing outcomes, covariates become informative for causal parameters that, absent missing data, would not depend on them. This result holds without structural or semiparametric restrictions, such as full-mediation or proportional-hazards assumptions, but the size of the gain depends on how well the covariate predicts the missing outcome. Surrogate outcomes are informed by the effect of the treatment and allow conditioning on the information set at the time of censoring, strictly improving on the gains from baseline covariates. We derive an efficiency bound that reveals the interplay between (i) the survival hazard, (ii) the censoring hazard, and (iii) time-adapted forecasts of the primary event time. We demonstrate the gains with semi-synthetic data from 93 metastatic breast cancer clinical trials.
