Primary Submission Category: Machine Learning and Causal Inference
Minimax optimal counterfactual density estimation
Authors: Edward Kennedy
Presenting Author: Edward Kennedy*
Causal effects are often characterized with averages, but these can give an incomplete picture of the underlying counterfactual distribution, e.g., when treatment mostly affects spread or other more complex distributional features beyond the mean. In this work, we therefore consider estimating the entire counterfactual density. We derive the minimax rate for counterfactual density estimation in a nonparametric model where distributional components are Hölder-smooth, and present several new estimators, giving high-level conditions under which they are minimax optimal. Importantly, our minimax results are derived via a localized version of the method of fuzzy hypotheses, combining lower bound constructions for nonparametric regression and functional estimation (thus providing connections to heterogeneous effect estimation). The minimax rate we find exhibits several interesting features, including a non-standard elbow phenomenon and an unusual interpolation between nonparametric regression and functional estimation rates. We illustrate our methods by estimating the density of CD4 count among patients with HIV, had all patients been treated with combination therapy versus zidovudine alone. Our results yield the practically important conclusion that combination therapy may have increased CD4 count most for high-risk patients.
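To make the target quantity concrete, the sketch below computes a simple inverse-probability-weighted kernel estimate of a counterfactual density. It is only an illustration under strong assumptions (no unmeasured confounding, a known propensity score, a Gaussian kernel with a hand-picked bandwidth) and is not one of the estimators proposed in this work. The function name `ipw_counterfactual_density` and all of its parameters are hypothetical, and the demo data are simulated.

```python
import numpy as np


def ipw_counterfactual_density(y, a, propensity, grid, bandwidth, arm=1):
    """Illustrative IPW kernel estimate of the density of the outcome
    that would be observed if everyone received treatment `arm`.

    `propensity` holds P(A = 1 | X) for each unit and is assumed known.
    """
    y = np.asarray(y, dtype=float)
    a = np.asarray(a)
    pi = np.asarray(propensity, dtype=float)
    grid = np.asarray(grid, dtype=float)

    # Probability of receiving the arm of interest for each unit.
    p_arm = pi if arm == 1 else 1.0 - pi
    # Units in the arm are up-weighted by the inverse of that probability;
    # units outside the arm get weight zero.
    weights = (a == arm) / p_arm

    # Gaussian kernel at (grid point - observation) / bandwidth,
    # arranged as a (len(grid), len(y)) array.
    u = (grid[:, None] - y[None, :]) / bandwidth
    kernel = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

    # Weighted kernel average over units, one value per grid point.
    return (kernel * weights[None, :]).mean(axis=1) / bandwidth


if __name__ == "__main__":
    # Simulated data: the covariate shifts both treatment uptake and outcome.
    rng = np.random.default_rng(0)
    n = 5000
    x = rng.uniform(size=n)
    pi = 0.3 + 0.4 * x
    a = rng.binomial(1, pi)
    y = 2.0 * x + a * (1.0 + x) + rng.normal(scale=0.5, size=n)

    grid = np.linspace(-3.0, 7.0, 201)
    treated = ipw_counterfactual_density(y, a, pi, grid, bandwidth=0.3, arm=1)
    control = ipw_counterfactual_density(y, a, pi, grid, bandwidth=0.3, arm=0)
    print("mode under treatment:", grid[treated.argmax()])
    print("mode under control:", grid[control.argmax()])
```

Comparing the two estimated curves over the whole grid, rather than only their means, is what allows differences in spread or shape between treatment arms to show up.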