Primary Submission Category: Causal Inference in Networks
Design and analysis for valid causal inference with network-dependent data
Authors: Zhejia Dong, Youjin Lee
Presenting Author: Zhejia Dong*
Matching is widely used to mimic randomized experiments by forming matched sets in which treated and control units differ only randomly with respect to important confounding variables. However, when the study population consists of interconnected units from a single network or a small number of networks, matching solely on confounding variables may produce matched units that do not differ randomly in network distance but are instead more likely to be closely connected. Such increased network closeness within matched sets may induce spurious associations between treatment and outcome when both variables exhibit shared autocorrelation patterns on the network. To reduce these spurious associations while preserving the validity of within-matched-set causal comparisons, we propose a new matching method that matches units with similar covariates and reduces within-matched-set dependence by imposing additional constraints on network proximity. At the analysis stage, to account for residual dependence across matched sets, we propose a valid randomization inference procedure for testing the sharp null hypothesis of no causal effect; the procedure accommodates across-matched-set dependence without explicit assumptions on the underlying dependence structure. We demonstrate the validity and utility of the proposed methods through simulation studies and an application to real-world HIV transmission network data.
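To make the idea of a network-proximity constraint concrete, the following is a minimal illustrative sketch and not the proposed method itself. It assumes a single scalar covariate, greedy one-to-one pair matching, a covariate caliper, and a minimum hop distance between matched units. The names `hop_distances`, `greedy_match`, `caliper`, and `min_hops` are hypothetical and chosen only for this illustration.

```python
from collections import deque


def hop_distances(adj, source):
    """Breadth-first search: hop distance from `source` to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def greedy_match(treated, controls, x, adj, caliper, min_hops):
    """Greedy 1:1 matching (an assumption made for illustration only).

    Each treated unit is paired with the unused control that is closest in
    the covariate `x`, subject to two constraints:
      * covariate gap at most `caliper`;
      * network distance at least `min_hops` (unreachable units count as
        infinitely far apart, so they always satisfy this constraint).
    Treated units with no eligible control are left unmatched.
    """
    pairs = []
    used = set()
    for t in treated:
        dist = hop_distances(adj, t)
        best, best_gap = None, None
        for c in controls:
            if c in used:
                continue
            gap = abs(x[t] - x[c])
            far_enough = dist.get(c, float("inf")) >= min_hops
            if gap <= caliper and far_enough and (best is None or gap < best_gap):
                best, best_gap = c, gap
        if best is not None:
            used.add(best)
            pairs.append((t, best))
    return pairs


# Toy path network 0-1-2-3-4-5: unit 0 is treated, units 1 and 4 are controls.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
x = {0: 1.0, 1: 1.0, 4: 1.2}

covariate_only = greedy_match([0], [1, 4], x, adj, caliper=0.5, min_hops=0)
with_constraint = greedy_match([0], [1, 4], x, adj, caliper=0.5, min_hops=2)
```

In this toy example, matching on the covariate alone pairs the treated unit with its direct neighbor (unit 1), whereas requiring at least two hops pairs it with the more distant unit 4 at the cost of a slightly larger covariate gap. That trade-off between covariate balance and within-set network closeness is what the sketch is meant to illustrate.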
