Primary Submission Category: Machine Learning and Causal Inference
Markov-Blanket-Guided Training for Tabular Foundation Models
Authors: Shu Wan, Abhinav Gorantla, Kasim Selcuk Candan, Huan Liu
Presenting Author: Shu Wan*
In a directed acyclic graph (DAG), the Markov blanket of a target variable consists of its parents, its children, and the other parents of its children. Conditioned on this set, the target is independent of all remaining variables, which makes the blanket the minimal and information-theoretically optimal feature set for prediction. Despite its central role in causal discovery and probabilistic graphical models, modern predictive systems rarely leverage Markov blanket structure explicitly during training. Meanwhile, tabular foundation models such as TabPFN introduce a new paradigm for supervised learning by training on large corpora of synthetically generated datasets derived from random DAGs. This generative perspective creates an opportunity to incorporate structural signals directly into the learning process.
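As a minimal illustration of the definition, the sketch below reads a Markov blanket off a known DAG. The edge-list representation, the function name `markov_blanket`, and the toy graph are assumptions made for this example; they are not taken from the MB-Guide or TabPFN code.

```python
def markov_blanket(edges, target):
    """Return the Markov blanket of `target` in a DAG.

    `edges` is an iterable of (parent, child) pairs (a hypothetical
    representation chosen for this sketch).
    """
    edges = list(edges)
    # Parents: nodes with an edge into the target.
    parents = {u for u, v in edges if v == target}
    # Children: nodes the target has an edge into.
    children = {v for u, v in edges if u == target}
    # Spouses: other parents of the target's children.
    spouses = {u for u, v in edges if v in children and u != target}
    return parents | children | spouses


# Toy DAG: D -> A -> Y -> C <- B -> E
edges = [("D", "A"), ("A", "Y"), ("Y", "C"), ("B", "C"), ("B", "E")]
blanket = markov_blanket(edges, "Y")
print(sorted(blanket))  # ['A', 'B', 'C']
```

In the toy graph, D (a grandparent) and E (a child of a spouse) fall outside the blanket even though both are connected to Y through other nodes.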
In this work, we propose MB-Guide, a Markov-Blanket-Guided training strategy for tabular foundation models. Instead of treating all features symmetrically, MB-Guide uses the Markov blanket of the target variable, available by construction in synthetic DAG-based data generation, as a structural supervision signal during training. The model is encouraged to prioritize blanket variables while suppressing irrelevant ones, aligning representation learning with the theoretically optimal predictive set.
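The abstract does not specify how the blanket enters the training objective. One plausible form, sketched below purely as an assumption, adds an auxiliary binary cross-entropy term that pushes per-feature relevance scores toward the known blanket membership mask. The names `mb_guidance_loss`, `total_loss`, `relevance_logits`, and the `weight` parameter are all hypothetical.

```python
import math


def bce_with_logit(logit, label):
    """Numerically stable binary cross-entropy for one logit and a 0/1 label."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def mb_guidance_loss(relevance_logits, in_blanket):
    """Mean BCE between per-feature relevance logits and blanket membership.

    `relevance_logits` stands in for whatever per-feature score a model
    exposes (an assumption); `in_blanket` holds 1 for blanket features
    and 0 otherwise, known by construction for synthetic DAG data.
    """
    losses = [bce_with_logit(x, y) for x, y in zip(relevance_logits, in_blanket)]
    return sum(losses) / len(losses)


def total_loss(prediction_loss, relevance_logits, in_blanket, weight=0.1):
    """Prediction loss plus a weighted structural guidance term (hypothetical)."""
    return prediction_loss + weight * mb_guidance_loss(relevance_logits, in_blanket)


# Scores that agree with the blanket mask incur a smaller penalty.
mask = [1, 1, 0, 0]
aligned = mb_guidance_loss([4.0, 3.0, -3.0, -4.0], mask)
misaligned = mb_guidance_loss([-4.0, -3.0, 3.0, 4.0], mask)
print(aligned < misaligned)  # True
```

Under this assumed formulation, `weight` trades off predictive accuracy against agreement with the blanket; setting it to zero recovers ordinary training.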
Empirical results demonstrate two desirable properties. First, MB-guided training improves computational efficiency by reducing effective feature redundancy during learning. Second, models trained with MB-Guide exhibit a strong ability to recover the target’s Markov blanket at inference time, enhancing structural interpretability.
