Primary Submission Category: Matching, Weighting
Weighting-based Identification and Estimation Techniques in Graphical Models of Missing Data
Authors: Anna Guo, Razieh Nabi
Presenting Author: Anna Guo*
In this paper, we propose a novel algorithm for identifying complete data distributions in graphical models of missing data. The algorithm imposes no restrictions on the complete data distribution and requires only that the missingness mechanisms factorize according to a conditional directed acyclic graph.
Our view aligns with prior work on missing data that frames identification using causal graphical models with hidden variables, where missingness indicators are viewed as “treatments” that could potentially be intervened on.
Selection bias is the primary obstacle to identification in missing data models under the interventionist perspective. To address this, our identification algorithm generates a tree data structure that facilitates tracking selection bias and provides insight into how it can be avoided. Building on this framework, we develop recursive weighting strategies for estimating missingness mechanisms and for conducting statistical analyses of the complete data law, extending inverse probability weighting methods to missing-not-at-random settings. We demonstrate the effectiveness of our approach through simulation studies, comparing it with classical methods such as multiple imputation and the EM algorithm across a range of analysis tasks. An accompanying R package, flexMissing, implements all proposed procedures.
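For readers unfamiliar with the baseline being extended, the following is a minimal sketch of classical inverse probability weighting in the simpler missing-at-random case. It is not the recursive missing-not-at-random procedure proposed here, and it does not use the flexMissing package. The function names `ipw_mean` and `simulate`, the binary covariate, and the missingness probabilities 0.9 and 0.3 are all assumptions made purely for illustration.

```python
import random


def ipw_mean(x, y, r):
    """IPW estimate of E[Y] when Y is missing at random given a discrete X.

    x: fully observed covariate, y: outcome (None when missing),
    r: missingness indicator (1 = observed, 0 = missing).
    """
    # Estimate P(R = 1 | X = level) by the observed fraction in each level.
    prop = {}
    for level in set(x):
        idx = [i for i, xi in enumerate(x) if xi == level]
        prop[level] = sum(r[i] for i in idx) / len(idx)
    # Weight each observed outcome by the inverse of its estimated propensity.
    total = sum(y[i] / prop[x[i]] for i in range(len(x)) if r[i] == 1)
    return total / len(x)


def simulate(n=20000, seed=0):
    """Hypothetical data: Y depends on X, and missingness of Y depends on X only."""
    rng = random.Random(seed)
    x = [1 if rng.random() < 0.5 else 0 for _ in range(n)]
    y_full = [xi + rng.gauss(0, 1) for xi in x]
    # Assumed mechanism: Y is observed with probability 0.9 if X = 1, else 0.3.
    r = [1 if rng.random() < (0.9 if xi == 1 else 0.3) else 0 for xi in x]
    y = [yi if ri == 1 else None for yi, ri in zip(y_full, r)]
    return x, y, r, y_full


x, y, r, y_full = simulate()
full_mean = sum(y_full) / len(y_full)                     # unavailable in practice
cc_mean = sum(yi for yi in y if yi is not None) / sum(r)  # complete-case mean
ipw = ipw_mean(x, y, r)                                   # reweighted mean
```

In this simulated setting the complete-case mean over-represents units with X = 1, while the reweighted mean should land close to the full-data mean. When missingness depends on values that are themselves unobserved, a single regression of the indicator on observed covariates is no longer sufficient, which is the setting the recursive weighting strategies above are designed for.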
