Primary Submission Category: Causal inference with high-dimensional covariates

Causal Inference with High-dimensional Discrete Covariates

Authors: Zhenghao Zeng, Edward Kennedy, Sivaraman Balakrishnan, Yanjun Han

Presenting Author: Zhenghao Zeng*

When estimating causal effects from observational studies, covariate adjustment is often required to remove confounding from the observed association between exposure and outcome. Modern datasets feature an increasing number of covariates, and researchers may have access to discrete covariates with a potentially large number of categories. In this setting, commonly assumed structures such as smoothness fail to hold, and the behavior of popular regression, weighting, and doubly robust estimators is not well understood. In this work, we study estimation of the causal effect in a model where the covariates required for confounding adjustment are discrete but high-dimensional, meaning the number of categories diverges. Specifically, we study the theoretical properties of the regression, weighting, and doubly robust estimators and provide necessary and sufficient conditions for them to be consistent. We also consider additional structures that can be exploited, namely effect homogeneity and prior knowledge of the covariate distribution, and propose new estimators that enjoy faster convergence rates and achieve consistency in a broader regime. The results are illustrated empirically via simulation studies. Importantly, we also derive a minimax lower bound for estimating the average treatment effect, which characterizes the fundamental difficulty of causal effect estimation in the high-dimensional discrete setting.
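To make the three estimator families concrete, the following is a minimal sketch of regression (plug-in), inverse-probability-weighting, and doubly robust estimates of the average treatment effect when the only covariate is a single discrete variable with `k` categories. It is an illustration under stated assumptions, not the authors' estimators: the data-generating process in `simulate`, the single sample split, the fallback values for empty cells, and the clipping of the estimated propensity are all hypothetical choices made here so that the example runs.

```python
import numpy as np

def simulate(n, k, rng):
    """Hypothetical design: one discrete covariate with k categories, true ATE = 1.0."""
    x = rng.integers(0, k, size=n)
    propensity = 0.3 + 0.4 * (x % 2)          # P(A=1 | X=x), bounded away from 0 and 1
    a = rng.binomial(1, propensity)
    y = 1.0 * a + 0.5 * (x % 3) + rng.normal(size=n)
    return x, a, y

def fit_nuisances(x, a, y, k):
    """Per-category outcome means and treatment frequencies (empirical cell averages)."""
    n1 = np.bincount(x, weights=a, minlength=k)            # treated count per category
    n0 = np.bincount(x, weights=1 - a, minlength=k)        # control count per category
    s1 = np.bincount(x, weights=a * y, minlength=k)        # treated outcome sum
    s0 = np.bincount(x, weights=(1 - a) * y, minlength=k)  # control outcome sum
    # Assumption: cells with no treated (or no control) units fall back to a mean of 0.
    mu1 = np.divide(s1, n1, out=np.zeros(k), where=n1 > 0)
    mu0 = np.divide(s0, n0, out=np.zeros(k), where=n0 > 0)
    # Assumption: empty cells get propensity 0.5; clipping avoids division by zero.
    pi = np.divide(n1, n1 + n0, out=np.full(k, 0.5), where=(n1 + n0) > 0)
    return mu1, mu0, np.clip(pi, 0.01, 0.99)

def ate_estimates(x, a, y, mu1, mu0, pi):
    """Evaluate the three estimators on a sample, given fitted nuisances."""
    m1, m0, p = mu1[x], mu0[x], pi[x]
    regression = np.mean(m1 - m0)
    weighting = np.mean(a * y / p - (1 - a) * y / (1 - p))
    doubly_robust = np.mean(
        m1 - m0 + a * (y - m1) / p - (1 - a) * (y - m0) / (1 - p)
    )
    return {
        "regression": regression,
        "weighting": weighting,
        "doubly_robust": doubly_robust,
    }

rng = np.random.default_rng(0)
k = 10
x, a, y = simulate(20_000, k, rng)
half = len(x) // 2
# Fit nuisances on the first half, evaluate the estimators on the second half.
mu1, mu0, pi = fit_nuisances(x[:half], a[:half], y[:half], k)
estimates = ate_estimates(x[half:], a[half:], y[half:], mu1, mu0, pi)
for name, value in estimates.items():
    print(f"{name}: {value:.3f}")
```

With `k` small relative to the sample size, every category contains many treated and control units and the cell averages are stable. Raising `k` toward or beyond `n` in this sketch leaves many categories empty or with only one treatment arm, so the fallback values start to drive the estimates; that sparse regime is the one the abstract is concerned with.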