Primary Submission Category: Generalizability/Transportability
Data Fusion with Distributional Equivalence Test-then-pool
Authors: Linying Yang, Xing Liu, Robin Evans
Presenting Author: Linying Yang*
Randomized controlled trials (RCTs) are the gold standard for causal inference, but high costs and recruitment challenges often make fully powered studies infeasible. A common remedy is to augment the control arm of the current trial with data from historical trials. However, naively pooling external controls can introduce bias when the populations differ. Existing test-then-pool (TTP) approaches attempt to guard against this risk, but standard implementations suffer both a loss of power and uncontrolled bias when the distributions differ.
We propose a new TTP framework that fuses control arms while rigorously controlling the Type-I error rate of the final treatment effect test. Our method leverages kernel two-sample testing via the maximum mean discrepancy (MMD) to capture distributional differences, and equivalence testing to avoid the bias introduced by an under-powered fusion test, yielding a more flexible and informative criterion for pooling. To ensure valid inference, we introduce partial bootstrap and partial permutation procedures for approximating null distributions in the presence of heterogeneous controls. We further establish the overall validity of the fused treatment effect test and provide guidance on selecting equivalence margins to balance power and robustness.
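As a rough illustration of the style of pooling criterion described above (a minimal sketch, not the authors' actual procedure), the code below estimates the squared MMD between the current and external control arms with an RBF kernel and pools only when a bootstrap upper confidence bound on the estimate falls below a user-chosen equivalence margin. The bandwidth, margin, bootstrap scheme, and function names are all illustrative assumptions.

```python
import numpy as np

def rbf_kernel(a, b, bandwidth=1.0):
    # Gaussian (RBF) kernel matrix between the rows of a and b.
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * bandwidth ** 2))

def mmd2_unbiased(x, y, bandwidth=1.0):
    # Unbiased estimator of the squared MMD between samples x and y.
    m, n = len(x), len(y)
    kxx = rbf_kernel(x, x, bandwidth)
    kyy = rbf_kernel(y, y, bandwidth)
    kxy = rbf_kernel(x, y, bandwidth)
    term_xx = (kxx.sum() - np.trace(kxx)) / (m * (m - 1))
    term_yy = (kyy.sum() - np.trace(kyy)) / (n * (n - 1))
    return term_xx + term_yy - 2 * kxy.mean()

def pool_decision(current_ctrl, external_ctrl, margin,
                  n_boot=500, alpha=0.05, seed=0):
    # Equivalence-style rule (illustrative): pool the external controls
    # only if a bootstrap upper confidence bound on MMD^2 is below the
    # equivalence margin, i.e. the arms look distributionally close.
    rng = np.random.default_rng(seed)
    stats = []
    for _ in range(n_boot):
        xb = current_ctrl[rng.integers(0, len(current_ctrl), len(current_ctrl))]
        yb = external_ctrl[rng.integers(0, len(external_ctrl), len(external_ctrl))]
        stats.append(mmd2_unbiased(xb, yb))
    upper = np.quantile(stats, 1 - alpha)
    return upper < margin
```

Under this toy rule, two control arms drawn from the same distribution would typically be pooled, while a clearly shifted external arm would be rejected; choosing the margin is exactly the power-versus-robustness trade-off the abstract discusses.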
