Primary Submission Category: Estimation under conditional ignorability
Regression’s weighting problem: a new analysis and simple fixes
Authors: Tanvi Shinkre, Chad Hazlett
Presenting Author: Tanvi Shinkre*
Researchers often use regression to estimate causal effects. Regressing the outcome of interest (Y) on a binary treatment indicator (D) without covariates (X) produces a coefficient that equals the difference in means and is unbiased for the average treatment effect (ATE). More often, covariates are included to adjust for confounding. It is now well known that the resulting coefficient is a weighted average of treatment effects across covariate strata, not the ATE. To address this, some researchers propose diagnostics that quantify the severity of this “weighting problem,” while others suggest alternative estimation strategies. After reviewing the literature, we develop a new expression for these weights, which is needed to recover the correct weights when D is not linear in X. We then compare a number of estimators and the assumptions under which they can recover the ATE. Central to the analysis is the recognition that multiple alternatives, including regression imputation (g-computation) and interacting the treatment with covariates (Lin 2013), can be motivated by assuming that the treatment and non-treatment potential outcomes are separately linear in X. Further, methods sharing this justification are equivalent: they produce the same estimates, and all of them sidestep the weighting problem. Beyond providing this theoretical clarity, we recommend the simple solution of interacting D with X, which solves the problem with minimal change to existing practice.
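The contrast described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code: the data-generating process (one binary covariate, stratum-specific effects of 1 and 5, treatment probabilities of 0.5 and 0.1), the helper `ols`, and all variable names are assumptions chosen for illustration. It uses a saturated binary X, the simple case in which D is linear in X and the familiar variance weights n_x p_x (1 - p_x) hold exactly in sample. It compares the additive regression coefficient with the interacted (Lin 2013) regression and with regression imputation.

```python
import numpy as np

# Hypothetical data-generating process, for illustration only.
rng = np.random.default_rng(0)
n = 20_000
x = rng.integers(0, 2, size=n).astype(float)   # one binary covariate (two strata)
p = np.where(x == 1, 0.1, 0.5)                 # treatment probability differs by stratum
d = (rng.random(n) < p).astype(float)          # binary treatment
tau = np.where(x == 1, 5.0, 1.0)               # stratum-specific treatment effects
y = 2.0 * x + tau * d + rng.normal(size=n)     # true ATE is about 3 (strata are roughly equal in size)


def ols(columns, outcome):
    """Least-squares coefficients with an intercept prepended (hypothetical helper)."""
    design = np.column_stack([np.ones(len(outcome))] + columns)
    coef, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    return coef


# 1) Additive regression Y ~ 1 + D + X: the coefficient on D.
beta_additive = ols([d, x], y)[1]

# 2) The same coefficient rebuilt as a weighted average of stratum effects,
#    with weights proportional to n_x * p_x * (1 - p_x).
weights, effects = [], []
for s in (0.0, 1.0):
    in_s = x == s
    p_s = d[in_s].mean()
    weights.append(in_s.sum() * p_s * (1.0 - p_s))
    effects.append(y[in_s & (d == 1)].mean() - y[in_s & (d == 0)].mean())
beta_weighted = np.average(effects, weights=weights)

# 3) Interacted regression Y ~ 1 + D + Xc + D*Xc with X centered at its sample mean.
xc = x - x.mean()
beta_lin = ols([d, xc, d * xc], y)[1]

# 4) Regression imputation (g-computation): fit each arm separately,
#    impute both potential outcomes for every unit, then average the difference.
b1 = ols([x[d == 1]], y[d == 1])
b0 = ols([x[d == 0]], y[d == 0])
design_all = np.column_stack([np.ones(n), x])
ate_gcomp = np.mean(design_all @ b1 - design_all @ b0)

print(f"additive OLS coefficient : {beta_additive:.3f}")
print(f"variance-weighted average: {beta_weighted:.3f}")
print(f"interacted (Lin) estimate: {beta_lin:.3f}")
print(f"g-computation estimate   : {ate_gcomp:.3f}")
```

In this setup the additive coefficient matches the variance-weighted average and is pulled toward the stratum with treatment probability near 0.5, while the interacted and imputation estimates coincide with each other and land near the ATE. The example does not cover the case where D is not linear in X, which is the setting the new weight expression in the abstract addresses.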