Primary Submission Category: Generalizability/Transportability

Completeness results for identification in selected data models for data fusion

Authors: Jaron Lee, AmirEmad Ghassami, Ilya Shpitser

Presenting Author: Jaron Lee*

Data fusion has become an increasingly important aspect of causal inference, as combining different datasets can improve both identification and estimation. We consider data fusion problems in which selection into different datasets is potentially systematic. We propose to model such systematic selection explicitly, which yields a hierarchy of selection problems analogous to the missing data hierarchy: selection completely at random (SCAR), selection at random (SAR), and selection not at random (SNAR). Importantly, this explicit modeling lets us pose identification and estimation in a coherent “full data distribution” that incorporates the selection mechanisms. We propose a graphical representation of the constraints in this full data model and use it to provide a sound and complete algorithm for identifying causal effects from combinations of observational and experimental domains, where the selection process and the variables in these domains are potentially confounded in complicated ways.