Primary Submission Category: Machine Learning and Causal Inference
Super Efficient Estimation for a Sieve of Statistical Models
Authors: Ivana Malenica, Mark van der Laan
Presenting Author: Ivana Malenica*
There is increasing interest in estimating causal effects in dependent settings, where dependence arises across time and/or samples. To make progress, assumptions on the statistical model are common, usually in the form of a known network, a known Markov order, or conditional independence. In this work, we propose a sieve-based approach for data-adaptively selecting a statistical model from an initial fully nonparametric space. We are concerned with statistical inference for a pathwise differentiable target parameter based on t time points (or n samples), where there is enough independence that the canonical gradient is asymptotically normal. One example considered is estimation of the causal effect of an intervention at time t on the proximal outcome given the past, averaged over all observed time points. We consider a sequence of nested models, indexed by a multivariate real-valued parameter, that approximates the true statistical model. We propose a data-adaptive selector of the index and estimate the target parameter with the corresponding targeted minimum loss-based estimator (TMLE). We show that, under regularity conditions, the proposed adaptive TMLE is asymptotically normal and super-efficient. This provides an important alternative to the TMLE for the true statistical model, which might be too large to be informative but can often be captured by a smaller model in the sieve, as is common in structured dependent settings.