Primary Submission Category: Applications in Social Sciences
Scalable Causal Inference in Marketing Mix Modeling: An Automated Double Machine Learning Pipeline with Continuous Treatments and Domain-Informed Constraints
Authors: Hirotoshi Nakahara, Martin Spindler
Presenting Author: Hirotoshi Nakahara*
Marketing Mix Modeling (MMM) is essential for privacy-compliant media measurement, but traditional approaches often suffer from functional-form misspecification and subjective tuning. This study presents an automated causal inference pipeline for MMM. It uses Double Machine Learning (DML) to estimate the effects of continuous media treatments, relying on Neyman-orthogonal scores to mitigate the regularization bias introduced by machine-learning nuisance estimates.
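The partialling-out idea behind DML for a single continuous treatment can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the pipeline described in this abstract: it assumes a partially linear model with a constant effect, uses ordinary least squares as a stand-in nuisance learner where the pipeline would use tuned ML models, and the names `fit_predict_ols` and `dml_plr` are hypothetical.

```python
import numpy as np


def fit_predict_ols(X_train, y_train, X_test):
    """Stand-in nuisance learner (assumption): OLS with an intercept.

    Any learner with the same fit-then-predict behaviour could replace it.
    """
    A_train = np.column_stack([np.ones(len(X_train)), X_train])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    A_test = np.column_stack([np.ones(len(X_test)), X_test])
    return A_test @ coef


def dml_plr(y, d, X, n_folds=5, seed=0):
    """Cross-fitted partialling-out estimate for a continuous treatment d.

    Returns the effect estimate, its standard error, and the
    out-of-fold residuals of the outcome and treatment models.
    """
    y = np.asarray(y, dtype=float)
    d = np.asarray(d, dtype=float)
    X = np.asarray(X, dtype=float)
    n = len(y)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)

    y_res = np.empty(n)
    d_res = np.empty(n)
    for test_idx in folds:
        train_mask = np.ones(n, dtype=bool)
        train_mask[test_idx] = False
        # Nuisance predictions are made only on held-out observations.
        y_hat = fit_predict_ols(X[train_mask], y[train_mask], X[test_idx])
        d_hat = fit_predict_ols(X[train_mask], d[train_mask], X[test_idx])
        y_res[test_idx] = y[test_idx] - y_hat
        d_res[test_idx] = d[test_idx] - d_hat

    # Residual-on-residual regression solves the Neyman-orthogonal score.
    theta = float(d_res @ y_res / (d_res @ d_res))
    psi = d_res * (y_res - theta * d_res)
    se = float(np.sqrt(np.mean(psi**2) / np.mean(d_res**2) ** 2 / n))
    return theta, se, y_res, d_res
```

Because the effect is estimated from residuals of both models, small errors in either nuisance fit affect the estimate only to second order, which is the property the abstract relies on.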
Our framework introduces an AutoML-based selection process for the nuisance functions (the outcome and treatment models), exploring learners such as XGBoost, LightGBM, and Random Forest. To select the DML configuration, we use a “combined loss” metric, defined as the product of the treatment-model RMSE and the sum of the treatment- and outcome-model RMSEs. Because the treatment-model RMSE enters both factors, this criterion gives extra weight to how well the treatment assignment mechanism is modeled, while still rewarding predictive accuracy of the outcome model.
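The combined loss described above can be written out in a few lines. The sketch below follows the definition given in the text (treatment RMSE times the sum of both RMSEs); the function names and the dictionary layout of candidate learner pairs are hypothetical, chosen only for illustration.

```python
import numpy as np


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))


def combined_loss(rmse_treatment, rmse_outcome):
    """Treatment RMSE multiplied by (treatment RMSE + outcome RMSE)."""
    return rmse_treatment * (rmse_treatment + rmse_outcome)


def select_learner_pair(candidates):
    """Return the candidate name with the smallest combined loss.

    `candidates` maps a name to a (treatment RMSE, outcome RMSE) tuple
    (hypothetical layout).
    """
    return min(candidates, key=lambda name: combined_loss(*candidates[name]))
```

In practice the two RMSEs would presumably be computed on out-of-fold predictions, so that the selection does not reward overfitted nuisance models.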
Furthermore, we incorporate two domain-informed constraints: (1) a non-negativity filter on estimated causal effects, reflecting the marketing prior that media spend does not reduce the outcome, and (2) a stability-based selection that minimizes the total variance of Conditional Average Treatment Effect (CATE) estimates across iterations. We validate the pipeline on semi-synthetic data with complex multicollinearity and on a real-world retail case study, where it yields more accurate and more stable estimates than conventional MMMs.
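One way the two constraints could be combined is sketched below. The abstract does not specify the details, so several choices here are assumptions: the filter is applied to the per-channel mean estimate, "total variance" is taken as the sum of per-channel sample variances across iterations, and the function name and input layout are hypothetical.

```python
import numpy as np


def select_stable_nonnegative(cate_by_config):
    """Pick the configuration with the smallest total variance among
    those whose mean effect estimates are all non-negative.

    `cate_by_config` maps a configuration name to an array of shape
    (n_iterations, n_channels) holding repeated effect estimates.
    Returns (name, total_variance), or (None, inf) if nothing passes.
    """
    best_name, best_var = None, float("inf")
    for name, estimates in cate_by_config.items():
        est = np.asarray(estimates, dtype=float)
        # Constraint 1 (assumed form): drop configurations with any
        # channel whose mean estimated effect is negative.
        if np.any(est.mean(axis=0) < 0):
            continue
        # Constraint 2 (assumed form): sum of per-channel sample
        # variances across iterations.
        total_var = float(est.var(axis=0, ddof=1).sum())
        if total_var < best_var:
            best_name, best_var = name, total_var
    return best_name, best_var
```

Applying the filter before the variance comparison means a very stable but negative estimate can never be selected.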
