Primary Submission Category: Design-Based Causal Inference
Exact Fisherian P-Values for Multi-Armed Bandits
Authors: Adam Sales, Ethan Prihar
Presenting Author: Adam Sales*
Randomized multi-armed bandits (RMABs) are experimental designs in which successive subjects are adaptively randomized between conditions: randomization probabilities are based on previous users’ outcomes. One heuristic, Thompson sampling, essentially randomizes users to conditions according to the posterior probabilities that each condition is optimal, conditional on previous users’ responses. However, from a scientific perspective, RMABs present a challenge: their adaptivity induces a complex dependence structure between the observations, which invalidates the usual approaches to statistical inference from randomized experiments, because those approaches assume some degree of independence between observations.
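As a rough illustration of the Thompson sampling heuristic described above, the following Python sketch estimates the allocation probabilities for binary outcomes. It assumes independent Beta(1, 1) priors on each condition's success rate and approximates the posterior probability that each condition is optimal by Monte Carlo. The function name `thompson_probs` and its parameters are hypothetical, chosen for illustration only, and are not taken from the paper.

```python
import numpy as np

def thompson_probs(successes, failures, n_draws=10_000, rng=None):
    """Monte Carlo estimate of P(condition k is optimal | data so far).

    Assumes binary outcomes and independent Beta(1, 1) priors, so the
    posterior for condition k is Beta(1 + successes[k], 1 + failures[k]).
    """
    rng = np.random.default_rng(rng)
    successes = np.asarray(successes)
    failures = np.asarray(failures)
    # Each row is one joint draw of all conditions' success rates.
    draws = rng.beta(1 + successes, 1 + failures,
                     size=(n_draws, len(successes)))
    # Share of draws in which each condition has the highest rate.
    best = draws.argmax(axis=1)
    return np.bincount(best, minlength=len(successes)) / n_draws

# Condition 0 has looked better so far, so it gets most of the probability.
probs = thompson_probs(successes=[8, 2], failures=[2, 8], rng=0)
```

The next subject would then be assigned to a condition drawn with these probabilities, which is what ties each assignment to the earlier subjects' outcomes.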
This paper illustrates a simple, but overlooked, solution: simulation-based exact p-values for Fisher’s strict null hypothesis of no effect. This approach requires a complete record of randomization probabilities, treatment assignments, and outcomes—and sufficient computational power—but little else. We illustrate this approach using a new dataset of over 200 RMABs conducted on an online homework platform, where the outcome of interest was students’ correctness on the next problem, and compare randomization-based p-values using a variety of test statistics to naïve chi-squared tests.
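A minimal Python sketch of the idea, under stated assumptions: under Fisher's strict null each subject's outcome is the same whatever condition they receive, so the observed outcome sequence can be held fixed while the adaptive assignment rule is re-run many times to build the null distribution of a test statistic. The sketch assumes binary outcomes, a Beta-Bernoulli Thompson sampler that updates after every subject, and a largest-difference-in-success-rates statistic. All function names are hypothetical, and the paper's actual assignment algorithm and test statistics may differ.

```python
import numpy as np

def thompson_assign(outcomes, n_arms, rng):
    """Re-run a Beta-Bernoulli Thompson sampler over a fixed outcome sequence."""
    succ = np.zeros(n_arms)
    fail = np.zeros(n_arms)
    arms = np.empty(len(outcomes), dtype=int)
    for i, y in enumerate(outcomes):
        # One posterior draw per arm; taking the argmax assigns each arm
        # with its posterior probability of being optimal.
        arms[i] = int(np.argmax(rng.beta(1 + succ, 1 + fail)))
        succ[arms[i]] += y
        fail[arms[i]] += 1 - y
    return arms

def max_diff_stat(arms, outcomes, n_arms):
    """Largest gap between arm-level success rates (empty arms are skipped)."""
    means = [outcomes[arms == k].mean()
             for k in range(n_arms) if np.any(arms == k)]
    return max(means) - min(means)

def randomization_pvalue(arms_obs, outcomes, n_arms, n_sims=2000, seed=0):
    """Monte Carlo randomization p-value for the strict null of no effect."""
    rng = np.random.default_rng(seed)
    outcomes = np.asarray(outcomes)
    arms_obs = np.asarray(arms_obs)
    t_obs = max_diff_stat(arms_obs, outcomes, n_arms)
    # Outcomes stay fixed under the strict null; only assignments are redrawn.
    t_sim = np.array([
        max_diff_stat(thompson_assign(outcomes, n_arms, rng), outcomes, n_arms)
        for _ in range(n_sims)
    ])
    # Adding 1 to numerator and denominator counts the observed assignment
    # among the draws, which keeps the p-value valid for finite n_sims.
    return (1 + np.sum(t_sim >= t_obs)) / (1 + n_sims)

# Illustrative data generated with no true effect (not from the paper).
rng = np.random.default_rng(1)
outcomes = rng.binomial(1, 0.6, size=200)
arms_obs = thompson_assign(outcomes, n_arms=2, rng=rng)
p_value = randomization_pvalue(arms_obs, outcomes, n_arms=2, n_sims=500)
```

Any other test statistic can be swapped in for `max_diff_stat` without changing the rest of the procedure, since validity comes from re-running the known assignment mechanism rather than from the statistic's sampling theory.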
