Primary Submission Category: Causal Discovery
Causal learning with unknown interventions: algorithms and guarantees
Authors: Armeen Taeb, Juan Gamella, Peter Buehlmann, Christina Heinze-Deml, Felix Hafenmair
Presenting Author: Armeen Taeb*
Causal inference from observational data alone is a challenging problem. The task becomes easier with access to data collected from perturbations of the underlying system, even when the nature of these perturbations is unknown. In this talk, we describe methods that use such perturbation data to identify plausible causal mechanisms. Specifically, in the context of Gaussian linear structural equation models, we first characterize the interventional equivalence class of DAGs. We then leverage this characterization to study the high-dimensional consistency guarantees of an ℓ0-penalized maximum likelihood estimator for learning this class. Since computing this estimator is generally intractable, we design a procedure called GnIES, which proceeds greedily in the space of interventionally equivalent models. In addition, we develop a novel procedure for generating semi-synthetic data sets that have a known causal ground truth but whose distributions closely resemble those of a real data set of choice. We use this procedure to evaluate the performance of GnIES on synthetic, real, and semi-synthetic data sets. Despite the strong Gaussian distributional assumption, GnIES is robust to an array of model violations and competitive in recovering the causal structure in small- to large-sample settings. We provide implementations of GnIES and of our semi-synthetic data generation procedure in the Python packages gnies and sempler.
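As a rough illustration of the setting (not the gnies or sempler API), the sketch below simulates a three-variable Gaussian linear structural equation model under two environments, one observational and one in which the noise scale of a single variable has been perturbed. The chain X1 -> X2 -> X3, the edge weights, and the function names are all hypothetical choices made for this example. The learner would see only the two samples, not which variable was perturbed.

```python
import random
import statistics


def sample_chain_sem(n, noise_sd, seed=0):
    """Draw n samples from the chain X1 -> X2 -> X3 with Gaussian noise.

    noise_sd holds the three noise standard deviations. The edge weights
    are fixed, hypothetical values chosen only for illustration.
    """
    rng = random.Random(seed)
    w12, w23 = 0.8, -1.5  # assumed edge weights, not from the abstract
    rows = []
    for _ in range(n):
        x1 = rng.gauss(0.0, noise_sd[0])
        x2 = w12 * x1 + rng.gauss(0.0, noise_sd[1])
        x3 = w23 * x2 + rng.gauss(0.0, noise_sd[2])
        rows.append((x1, x2, x3))
    return rows


def column_variances(rows):
    """Sample variance of each variable across the rows."""
    return [statistics.variance(col) for col in zip(*rows)]


# Environment 1: observational. Environment 2: the noise of X2 is perturbed.
observational = sample_chain_sem(5000, (1.0, 1.0, 1.0), seed=1)
perturbed = sample_chain_sem(5000, (1.0, 3.0, 1.0), seed=2)

var_obs = column_variances(observational)
var_pert = column_variances(perturbed)
```

In this toy example the variance of X1 is approximately unchanged across the two environments, while the variances of X2 and of its descendant X3 shift. Differences of this kind between environments are what make perturbation data informative about causal direction even when the intervention targets are not recorded.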