This Article explores the many kinds of quantitative claims that researchers and commentators regularly make about race and policing. Everyone agrees that there are enormous racial gaps in U.S. rates of stops, arrests, searches, and use of force. But there are dramatically conflicting claims as to why. Policing is hard to study, but the problem isn’t just the data shortcomings with which the literature has long struggled. It’s confusion about what questions we should be asking. Different kinds of numerical comparisons and research designs often imply sharply differing conceptions of what racial equality in policing means. These normative premises often go unstated, so readers may easily miss the differences. The overarching objective of this Article is to highlight the connection between the normative and the empirical. I identify plausible conceptions of racial equality in policing and assess which empirical methods can best test those conceptions. The Article gives particular attention to how researchers should address two important research questions. The first is whether differences in criminal conduct explain policing disparities. Empirical researchers as well as casual commentators typically purport to address this question either by comparing racial groups’ shares of police interactions to their shares of crime, or by comparing two groups’ ratio of police interactions to their ratio of crimes. Using examples and mathematical proofs, I show that neither of these comparison types answers the key question of whether people with like criminal conduct are being treated the same way. These comparisons generally overcorrect for racial differences in criminal conduct, misleadingly masking the size (or even reversing the apparent direction) of disparities in policing of people with the same conduct.
Second, I examine how researchers should investigate the effects of racial discrimination—a morally important and legally central question, but one that poses serious causal inference challenges. I review several methods in the current literature, which offer useful insights but have substantial limitations, and critique the recently dominant “hit-rate” approach, which relies on faulty normative and empirical premises. Instead, I propose supplementing existing tools with a new approach: the use of “testers.”
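
The overcorrection claim made under the first research question can be illustrated with a small numerical sketch. The figures below are hypothetical and are not drawn from the Article or from any dataset; the group labels, population sizes, offender counts, and stop probabilities are all assumptions chosen for illustration. The sketch assumes that offenders in both groups are stopped at the same rate, while non-offenders in group A are stopped at twice the rate of non-offenders in group B. Because most stops in this setup fall on non-offenders, dividing the ratio of stops by the ratio of crimes yields a figure close to 1, which hides the twofold disparity among people with the same conduct.

```python
# Hypothetical illustration only: all numbers are assumptions, not data.

def expected_stops(population, offenders, p_stop_offender, p_stop_innocent):
    """Expected number of stops in a group, given stop probabilities by conduct."""
    innocents = population - offenders
    return offenders * p_stop_offender + innocents * p_stop_innocent

# Two groups of equal size; group A has twice as many offenders as group B.
population = 1000
offenders_a, offenders_b = 20, 10

# Offenders are stopped at the same rate in both groups.
p_stop_offender = 0.5
# Non-offenders in group A are stopped twice as often as those in group B.
p_stop_innocent_a, p_stop_innocent_b = 0.10, 0.05

stops_a = expected_stops(population, offenders_a, p_stop_offender, p_stop_innocent_a)
stops_b = expected_stops(population, offenders_b, p_stop_offender, p_stop_innocent_b)

stop_ratio = stops_a / stops_b            # ratio of police interactions, A to B
crime_ratio = offenders_a / offenders_b   # ratio of crimes, A to B
benchmark = stop_ratio / crime_ratio      # the "ratio of ratios" comparison

# Disparity among people with the same (non-offending) conduct.
same_conduct_ratio = p_stop_innocent_a / p_stop_innocent_b

print(f"stop ratio (A:B):            {stop_ratio:.2f}")
print(f"crime ratio (A:B):           {crime_ratio:.2f}")
print(f"benchmark (stops / crimes):  {benchmark:.2f}")
print(f"same-conduct stop ratio:     {same_conduct_ratio:.2f}")
```

Under these assumed numbers, the benchmark comparison comes out near parity even though a non-offender in group A is twice as likely to be stopped as a non-offender in group B. Raising group A's assumed offender count further would push the benchmark below 1, making the disparity appear to run in the opposite direction.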



