Primary Submission Category: Matching, Weighting
Development and Evaluation of Ensemble Propensity Score Matching: A Comparison with the Covariate Balancing Propensity Score
Authors: Yasutaka Hasegawa, Takanobu Osaki, Hideyuki Ban, Takayuki Arai, Tsutomu Kikuchi
Presenting Author: Yasutaka Hasegawa*
Covariate balance is essential for estimating the average treatment effect on the treated (ATT) from observational data. Propensity score matching (PSM) is widely used, but reliance on a single propensity score (PS) model can be brittle under model misspecification. We propose Ensemble Propensity Score Matching (Ensemble PSM), which fits multiple PS estimators and selects the matched sample that minimizes overall covariate imbalance. We estimate the PS using logistic regression, LASSO, elastic net, gradient-boosted trees, and a neural network; for each PS model we create 1:1 caliper matches without replacement, compute standardized mean differences (SMDs) across all covariates, and select the candidate matched sample with the smallest mean SMD. We evaluated the method on a real-world health guidance dataset (treated n=3,574; controls n=9,668; 31 covariates) and benchmarked it against CBPS-PSM (PS estimated via the covariate balancing propensity score, then matched under the same 1:1 caliper-without-replacement design), with both methods targeting the ATT. Ensemble PSM achieved a mean SMD of 0.00266 and a maximum SMD of 0.00754, versus 0.00796 and 0.02381 for CBPS-PSM, corresponding to reductions of 66.6% and 68.3%, respectively. Ensemble PSM also outperformed single-model PSM baselines (e.g., logistic-regression PSM: mean/maximum SMD of 0.01123/0.02775). These findings suggest that balance-driven ensemble selection can improve the robustness of covariate adjustment for causal effect estimation in observational health service evaluations.
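The selection step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`caliper_match`, `abs_smds`, `select_best_match`), the greedy nearest-neighbor matching order, the caliper of 0.05 on the raw PS scale, the pooled-variance SMD definition, and the synthetic data are all assumptions made for illustration. Two precomputed score vectors stand in for the five fitted PS estimators.

```python
import math
import random
import statistics


def caliper_match(ps, treated, caliper):
    """Greedy 1:1 nearest-neighbor matching on the PS, without replacement."""
    treated_idx = [i for i, t in enumerate(treated) if t == 1]
    available = [i for i, t in enumerate(treated) if t == 0]
    pairs = []
    for i in treated_idx:
        if not available:
            break
        # Closest still-unused control for this treated unit.
        j = min(available, key=lambda k: abs(ps[i] - ps[k]))
        if abs(ps[i] - ps[j]) <= caliper:
            pairs.append((i, j))
            available.remove(j)  # without replacement
    return pairs


def abs_smds(X, pairs):
    """Absolute standardized mean difference per covariate in the matched sample."""
    t_rows = [X[i] for i, _ in pairs]
    c_rows = [X[j] for _, j in pairs]
    out = []
    for col in range(len(X[0])):
        t = [r[col] for r in t_rows]
        c = [r[col] for r in c_rows]
        pooled_sd = math.sqrt((statistics.variance(t) + statistics.variance(c)) / 2)
        diff = abs(statistics.fmean(t) - statistics.fmean(c))
        out.append(diff / pooled_sd if pooled_sd > 0 else 0.0)
    return out


def select_best_match(X, treated, candidate_ps, caliper=0.05):
    """Match once per candidate PS and keep the match with the smallest mean SMD."""
    best = None
    for name, ps in candidate_ps.items():
        pairs = caliper_match(ps, treated, caliper)
        if len(pairs) < 2:  # variances need at least two matched pairs
            continue
        smds = abs_smds(X, pairs)
        result = {
            "model": name,
            "pairs": pairs,
            "mean_smd": statistics.fmean(smds),
            "max_smd": max(smds),
        }
        if best is None or result["mean_smd"] < best["mean_smd"]:
            best = result
    return best


# Synthetic illustration only (hypothetical data, not the study dataset).
random.seed(0)
n = 300
X = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(n)]
true_ps = [1 / (1 + math.exp(-(0.8 * x1 - 0.5 * x2 - 1.0))) for x1, x2 in X]
treated = [1 if random.random() < p else 0 for p in true_ps]
candidate_ps = {
    "model_a": true_ps,  # stands in for a well-specified PS model
    "model_b": [1 / (1 + math.exp(-(0.8 * x1 - 1.0))) for x1, _ in X],  # omits x2
}
best = select_best_match(X, treated, candidate_ps)
print(best["model"], len(best["pairs"]), best["mean_smd"], best["max_smd"])
```

In this sketch the selection criterion looks only at mean SMD, so a candidate that discards many treated units could win on balance alone; the abstract does not say how matched sample size enters the selection, and a practical version would likely report or constrain it. The reported reductions follow directly from the stated values: (0.00796 - 0.00266) / 0.00796 ≈ 66.6% for mean SMD and (0.02381 - 0.00754) / 0.02381 ≈ 68.3% for maximum SMD.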
