Primary Submission Category: Machine Learning and Causal Inference
Methods for obtaining counterfactual predictions and quantifying associated uncertainty using observational data
Authors: Karla DiazOrdaz
Presenting Author: Karla DiazOrdaz*
Prediction models, whether statistical or AI, are often used to support decision making. However, standard prediction models should not be used to answer ‘what if’ questions, because they describe outcomes under the treatment patterns observed in the data rather than under a hypothetical intervention. Failure to recognise when the prediction estimand is causal leads to incorrect risk predictions and suboptimal treatment or policy decisions.
We focus on counterfactual predictions: for each individual, we predict what their outcome would be under a hypothetical policy or treatment, assuming that the causal structure is known and that there are no unobserved confounders.
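As a sketch of what this estimand looks like in symbols (the notation is an assumption introduced here for illustration and does not appear in the abstract): let X denote covariates, A the treatment, Y the observed outcome, and Y^a the outcome that would be seen under treatment level a. Under the standard conditions of consistency, conditional exchangeability (no unobserved confounders) and positivity, the counterfactual prediction target can be identified from observational data:

```latex
\mu_a(x) \;=\; \mathbb{E}\left[\,Y^{a} \mid X = x\,\right]
        \;=\; \mathbb{E}\left[\,Y \mid A = a,\; X = x\,\right],
\qquad \text{assuming } Y^{a} \perp A \mid X
\text{ and } \Pr(A = a \mid X = x) > 0 .
```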
Targeting a causal prediction estimand brings new challenges: we can only use the observed (factual) treated sample to develop the model, but we must make predictions for the entire population. In the presence of confounding, the covariate distribution of the factual treated may differ substantially from that of the target population (i.e. covariate shift). We also consider situations where relevant variables are available at the model-building stage but not at deployment.
We review some existing methods that accommodate machine learning (e.g. the DR-learner) and make a simpler proposal, based on inverse weighting, to handle covariate shift. We also implement distribution-free prediction intervals using conformal inference. We compare the methods in a simulation study and illustrate them in a real example using electronic health records to obtain counterfactual predictions for patients with type 2 diabetes under different choices of HbA1c-lowering drugs.
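To make the two ingredients above concrete, here is a minimal sketch on simulated data, not the authors' implementation. It assumes a single confounder, a linear outcome model, a logistic propensity model, and weights equal to the inverse of the estimated propensity; the intervals use weighted split conformal prediction, which is one known way to adapt conformal inference to covariate shift. All function names and the data-generating process are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def simulate(n):
    """Toy data (hypothetical): x confounds treatment a and the outcome under treatment y1."""
    x = rng.normal(size=n)
    a = rng.binomial(1, sigmoid(x))          # treated units tend to have larger x
    y1 = 1.0 + 2.0 * x + rng.normal(size=n)  # potential outcome under treatment
    return x, a, y1

def design(x):
    """Design matrix with an intercept column."""
    return np.column_stack([np.ones_like(x), x])

def fit_logistic(X, a, iters=25):
    """Propensity model P(A=1 | X), fitted by Newton-Raphson."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = sigmoid(X @ beta)
        hess = (X * (p * (1 - p))[:, None]).T @ X + 1e-8 * np.eye(X.shape[1])
        beta += np.linalg.solve(hess, X.T @ (a - p))
    return beta

def fit_wls(X, y, w):
    """Weighted least squares: solve (X'WX) beta = X'Wy."""
    XtW = (X * w[:, None]).T
    return np.linalg.solve(XtW @ X, XtW @ y)

x, a, y1 = simulate(4000)
y = np.where(a == 1, y1, np.nan)             # y1 is observed only for the treated

# Split the treated sample into a training half and a calibration half.
treated = np.flatnonzero(a == 1)
train, calib = treated[::2], treated[1::2]

# Fit the propensity model without the calibration units.
keep = np.ones(len(x), dtype=bool)
keep[calib] = False
gamma = fit_logistic(design(x[keep]), a[keep])

def inv_weight(x_vals):
    """Inverse of the estimated propensity: reweights the treated towards the whole population."""
    return 1.0 / sigmoid(design(x_vals) @ gamma)

# Outcome model fitted on the treated training half with inverse weights.
beta = fit_wls(design(x[train]), y[train], inv_weight(x[train]))

# Absolute residuals on the calibration half, sorted with their weights.
scores = np.abs(y[calib] - design(x[calib]) @ beta)
order = np.argsort(scores)
s_sorted = scores[order]
cum_w = np.cumsum(inv_weight(x[calib])[order])

def predict_interval(x_new, alpha=0.1):
    """Weighted split conformal interval for the outcome under treatment."""
    x_new = np.atleast_1d(np.asarray(x_new, dtype=float))
    centre = design(x_new) @ beta
    half = np.empty_like(centre)
    for i, wn in enumerate(inv_weight(x_new)):
        cdf = cum_w / (cum_w[-1] + wn)       # the new point carries mass at +infinity
        k = np.searchsorted(cdf, 1 - alpha)  # first score whose weighted CDF reaches 1 - alpha
        half[i] = s_sorted[k] if k < len(s_sorted) else np.inf
    return centre - half, centre + half

# Check coverage on a fresh draw from the whole population (treated and untreated alike).
x_te, _, y1_te = simulate(2000)
lo, hi = predict_interval(x_te)
coverage = np.mean((y1_te >= lo) & (y1_te <= hi))
print(f"empirical coverage of the outcome under treatment: {coverage:.3f}")
```

In this toy setting the outcome model is correctly specified, so the weights matter mainly for the calibration step; with a misspecified or flexible learner they would also change the fitted model. The simulation can evaluate coverage only because it generates the potential outcome for everyone, which real observational data do not provide.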